Abstract

Most human tumors result from the accumulation of multiple genetic and epigenetic
alterations in a single cell. Mutations that confer a fitness advantage to the cell
are known as driver mutations and are causally related to tumorigenesis. Other mutations,
however, do not change the phenotype of the cell or even decrease cellular fitness.
While much experimental effort is being devoted to the identification of the different
functional effects of individual mutations, mathematical modeling of tumor progression
generally considers constant fitness increments as mutations are accumulated. In
this paper we study a mathematical model of tumor progression with random fitness
increments. We analyze a multi-type branching process in which cells accumulate mutations
whose fitness effects are chosen from a distribution. We determine the effect
of the fitness distribution on the growth kinetics of the tumor. This work contributes
to a quantitative understanding of the accumulation of mutations leading to cancer
phenotypes.

1 Introduction



Tumors result from an evolutionary process occurring within a tissue (Nowell, 1976). From an 
evolutionary point of view, tumors can be considered as collections of cells that accumulate 
genetic and epigenetic alterations. The phenotypic changes that these alterations confer 
to cells are subjected to the selection pressures within the tissue and lead to adaptations 
such as the evolution of more aggressive cell types, the emergence of resistance, induction 
of angiogenesis, evasion of the immune system, and colonization of distant organs with 
metastatic growth. Advantageous heritable alterations can cause a rapid expansion of the 
cell clone harboring such changes, since these cells are capable of outcompeting cells that 
have not evolved similar adaptations. The investigation of the dynamics of cell growth, 
the speed of accumulating mutations, and the distribution of different cell types at various 
timepoints during tumorigenesis is important for an understanding of the natural history 
of tumors. Further, such knowledge aids in the prognosis of newly diagnosed tumors, since
the presence of cell clones with aggressive phenotypes leads to less optimistic predictions for
tumor progression. Finally, knowledge of the composition of tumors allows for the choice of
optimum therapeutic interventions, as tumors harboring pre-existing resistant clones should
be treated differently than drug-sensitive cell populations.

Mathematical models have led to many important insights into the dynamics of tumor 
progression and the evolution of resistance (Goldie and Coldman, 1983 and 1984; Bodmer 
and Tomlinson, 1995; Coldman and Murray, 2000; Knudson, 2001; Maley and Forrest, 2001; 
Michor et al., 2004; Iwasa et al., 2005; Komarova and Wodarz, 2005; Michor et al., 2006; 
Michor and Iwasa, 2006; Frank 2007; Wodarz and Komarova, 2007). These mathematical 
models generally fall into one of two classes: (i) constant population size models, and (ii) 
models describing exponentially growing populations. Many theoretical investigations of
exponentially growing populations employ multi-type branching process models (e.g., Iwasa et
al., 2006; Haeno et al., 2007; Durrett and Moseley, 2009), while others use population genetic
models for homogeneously mixing exponentially growing populations (e.g., Beerenwinkel et
al., 2007; Durrett and Mayberry, 2009). In this paper, we focus on branching process models.
In these models, cells with $i \ge 0$ mutations are denoted as type-$i$ cells, and $Z_i(t)$ specifies the
number of type-$i$ cells at time $t$. Type-$i$ cells die at rate $b_i$, give birth to one new type-$i$ cell
at rate $a_i$, and give birth to one new type-$(i+1)$ cell at rate $u_{i+1}$. In an alternate version,
mutations occur with probability $\mu_{i+1}$ during birth events, which occur at rate $\alpha_i$. These
two versions are equivalent provided $u_{i+1} = \alpha_i \mu_{i+1}$ and $a_i = \alpha_i(1 - \mu_{i+1})$. However, the
relationship between the parameters must be kept in mind when comparing results between 
different formulations of the model. 
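The relationship between the two formulations can be made concrete with a small conversion sketch. The function names below are hypothetical and introduced only for illustration; the formulas are the equivalence conditions $u_{i+1} = \alpha_i\mu_{i+1}$ and $a_i = \alpha_i(1-\mu_{i+1})$ stated above.

```python
def to_rate_form(alpha_i, mu_next):
    """Map (birth-event rate alpha_i, mutation probability mu_{i+1})
    to (a_i, u_{i+1}) in the competing-rates formulation."""
    a_i = alpha_i * (1.0 - mu_next)   # births that do not mutate
    u_next = alpha_i * mu_next        # births that produce a type-(i+1) cell
    return a_i, u_next


def to_probability_form(a_i, u_next):
    """Inverse map: recover (alpha_i, mu_{i+1}) from (a_i, u_{i+1})."""
    alpha_i = a_i + u_next            # total rate of birth events
    mu_next = u_next / alpha_i        # fraction of births that mutate
    return alpha_i, mu_next
```

Converting in one direction and back returns the starting parameters, which is a quick way to check that results quoted in one formulation are being compared with the matching parameters in the other.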

One biologically unrealistic aspect of this model as presented in the literature is that all 
type-i cells are assumed to have the same birth and death rates. This assumption describes 
situations during tumorigenesis in which the order of mutations is predetermined, i.e. the 
genetic changes can only be accumulated in a particular sequence and all other combinations 
of mutations lead to lethality. Furthermore, in this interpretation of the model, there cannot 
be any variability in phenotype among cells with the same number of mutations. In many
situations arising in biology, however, there is marked heterogeneity in phenotype even if
genetically, the cells are identical (Elowitz et al., 2002; Becskei et al., 2005; Kaern et al., 
2005; Feinerman et al., 2008). This variability may be driven by stochasticity in gene 
expression or in post-transcriptional or post-translational modifications. In this paper, we 
modify the branching process model so that mutations alter cell birth rates by a random 
amount. 

An important consideration for this endeavor is the choice of the mutational fitness 
distribution. The exponential distribution has become the preferred candidate in theoretical 
studies of the genetics of adaptation. The first theoretical justification of this choice was 
given by Gillespie (1983, 1984), who argued that if the number of possible alleles is large 
and the current allele is close to the top of the rank ordering in fitness values, then extreme 
value theory should provide insight into the distribution of the fitness values of mutations. 
For many distributions including the normal, Gamma, and lognormal distributions, the
maximum of $n$ independent draws, when properly scaled, converges to the Gumbel or double
exponential distribution, $\Lambda(x) = \exp(-e^{-x})$. In the biological literature, it is generally noted
that this class of distributions only excludes exotic distributions like the Cauchy distribution,
which has no moments. However, in reality, it eliminates all distributions with
$P(X > x) \sim C x^{-\alpha}$. For distributions in the domain of attraction of the Gumbel distribution, if
$Y_1 > Y_2 > \cdots > Y_k$ are the $k$ largest observations in a sample of size $n$, then there is a sequence
of constants $b_n$ so that the spacings $Z_i = i(Y_i - Y_{i+1})/b_n$ converge to independent exponentials
with mean 1; see, e.g., Weissman (1978). Following up on Gillespie's work, Orr (2003) added
the observation that in this setting, the fitness increase due to a beneficial
mutation has the same distribution as $Z_1$, independent of the rank $i$ of the wild type cell.
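The spacing result can be illustrated by simulation. The sketch below is not from the paper: it assumes standard exponential draws, which lie in the Gumbel domain of attraction and for which one may take $b_n = 1$, and it estimates the mean of each scaled spacing $Z_i = i(Y_i - Y_{i+1})$. The function names and default sample sizes are illustrative choices.

```python
import random


def scaled_spacings(n, k, rng):
    """Return Z_i = i * (Y_i - Y_{i+1}), i = 1..k, for the largest of n Exp(1) draws."""
    sample = sorted((rng.expovariate(1.0) for _ in range(n)), reverse=True)
    top = sample[: k + 1]
    # top[0] is Y_1 (the maximum), so list index i corresponds to rank i + 1.
    return [(i + 1) * (top[i] - top[i + 1]) for i in range(k)]


def mean_spacings(n=200, k=3, reps=2000, seed=1):
    """Monte Carlo estimate of E[Z_i] for i = 1..k."""
    rng = random.Random(seed)
    totals = [0.0] * k
    for _ in range(reps):
        for i, z in enumerate(scaled_spacings(n, k, rng)):
            totals[i] += z
    return [total / reps for total in totals]
```

Under these assumptions each estimated mean should be close to 1, in line with the limit of independent mean-1 exponentials.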

To infer the distribution of fitness effects of newly emerged beneficial mutations, several 
experimental studies were performed; for examples, see Imhoff and Schlotterer (2001),
Sanjuan et al. (2004), and Kassen and Bataillon (2006). The data from these experiments are
generally consistent with an exponential distribution of fitness effects. However, there is an
experimental caveat that cannot be neglected (Rozen et al., 2002): if only those mutations
are considered that reach 100% frequency in the population, then the exponential distribution
is multiplied by the fixation probability. By this operation, a distribution with a mode
at a positive value develops. In a study of a quasi-empirical model of RNA evolution in which 
fitness was based on secondary structures, Cowperthwaite et al. (2005) found that fitnesses 
of randomly selected genotypes appeared to follow a Gumbel-type distribution. They also 
discovered that the fitness distribution of beneficial mutations appeared exponential only 
when the vast majority of small-effect mutations were ignored. Furthermore, it was determined
that the distribution of beneficial mutations depends on the fitness of the parental
genotype (Cowperthwaite et al., 2005; MacLean and Buckling, 2009). However, since the 
exceptions to this conclusion arise when the fitness of the wild type cell is low, these findings 
do not contradict the picture based on extreme value theory. 

In contrast to the evidence above, recent work of Rokyta et al. (2008) has shown that
in two sets of beneficial mutations arising in the bacteriophage ID11 and in the phage φ6
- for which the mutations were identified by sequencing - beneficial fitness effects are not
exponential. Using a statistical method developed by Beisel et al. (2007), they tested the
null hypothesis that the fitness distribution has an exponential tail. They found that the
null hypothesis could be rejected in favor of a distribution with a right truncated tail. Their
data also violated the common assumption that small-effect mutations greatly outnumber
those of large effect, as they were consistent with a uniform distribution of beneficial effects.
A possible explanation for the bounded fitness distribution may be found in the culture
conditions utilized in the experiments: they evolved ID11 on E. coli at an elevated temperature
(37°C instead of 33°C). There may be a limited number of mutations that will enable
ID11 to survive in increased temperatures. The latter situation may be similar to scenarios
arising during tumorigenesis, where, in order to develop resistance to a drug or to progress 
to a more aggressive stage, the conformation of a particular protein must be changed or 
a certain regulatory network must be disrupted. If there is a finite, but large, number of 
possible beneficial mutations, then it is convenient to use a continuous distribution as an 
approximation. 

In this paper, we consider both bounded distributions and unbounded distributions for 
the fitness advance and derive asymptotic results for the number of type-$k$ individuals at
time t. We determine the effects of the fitness distribution on the growth kinetics of the 
population, and investigate the rates of expansion for both bounded and unbounded fitness 
distributions. This model provides a framework to investigate the accumulation of mutations 
with random fitness effects. 

The remainder of this section is dedicated to statements and discussion of our main
results. Proofs of these results can be found in Sections 2-5.

1.1 Bounded distributions 

Let us consider a multi-type branching process in which type-$i$ cells have accumulated $i \ge 0$
advantageous mutations. Suppose the initial population consists entirely of type-0 cells that
give birth at rate $a_0$ to new type-0 cells, die at rate $b_0 < a_0$, and give birth to new type-1 cells
at rate $u_1$. The parameters $a_0$, $b_0$, and $u_1$ denote the birth rate, death rate, and mutation rate
for type-0 cells. To simplify computations, we will approximate the number of type-0 cells by
$Z_0(t) = V_0 e^{\lambda_0 t}$, where $\lambda_0 = a_0 - b_0 > 0$. If the initial cell population $Z_0(0) = V_0 \gg 1/\lambda_0$, then
the branching process giving the number of 0's is almost deterministic and this approximation
is accurate. When a new type-1 cell is born, we choose $x \ge 0$ according to a continuous
probability distribution $\nu$. The new type-1 cell and its descendants then have birth rate
$a_0 + x$, death rate $b_0$, and mutation rate $u_2$. In general, type-$k$ cells with birth rate $a$ mutate
to type-$(k+1)$ cells at rate $u_{k+1}$ and when a mutation occurs, the new type-$(k+1)$ cell and
its descendants have an increased birth rate $a + x$ where $x \ge 0$ is drawn according to $\nu$. We
let $Z_k(t)$ denote the total number of type-$k$ cells in the population at time $t$. When we refer
to the $k$th generation of mutants, we mean the set of all type-$k$ cells.

We begin by considering situations in which the distribution of the increase in the birth
rate is concentrated on $[0, b]$. In particular, suppose that $\nu$ has density $g$ with support in
$[0, b]$ and assume that $g$ satisfies:

(*) $g$ is continuous at $b$, $g(b) > 0$, $g(x) \le G$ for $x \in [0, b]$.

Our first result describes the mean number of first generation mutants at time $t$, $EZ_1(t)$.

Theorem 1. If (*) holds, then

$$EZ_1(t) \sim \frac{u_1 V_0\, g(b)}{bt}\, e^{(\lambda_0 + b)t}$$

where $a(t) \sim b(t)$ means $a(t)/b(t) \to 1$.
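As a numerical sanity check (not part of the original argument), one can compare the exact mean $u_1 V_0 e^{\lambda_0 t}\int_0^b g(x)(e^{tx}-1)/x\,dx$ derived in Section 3 with the asymptotic form $u_1 V_0\, g(b)\, e^{(\lambda_0+b)t}/(bt)$. The sketch below assumes a uniform density $g = 1/b$ on $[0,b]$, which is our choice for illustration; in that case the ratio of the two expressions depends only on $T = bt$ and, after the substitution $y = (b-x)t$, equals $\int_0^T (e^{-y} - e^{-T})/(1 - y/T)\,dy$.

```python
import math


def mean_ratio(T, n=200_000):
    """Exact mean divided by its asymptotic form, for uniform g on [0, b].

    T stands for b * t. The integral of (exp(-y) - exp(-T)) / (1 - y / T)
    over 0 < y < T is evaluated with the midpoint rule, which never
    touches the endpoint y = T where the denominator vanishes.
    """
    h = T / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        total += (math.exp(-y) - math.exp(-T)) / (1.0 - y / T)
    return total * h
```

The ratio approaches 1 from above as $T$ grows, with a leading correction of order $1/T$, so the asymptotic formula is accurate only once $bt$ is fairly large.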

The next result shows that the actual growth rate of type-1 cells is slower than the mean.
Here, and in what follows, we use $\Rightarrow$ to indicate convergence in distribution.

Theorem 2. If (*) holds and $p = b/\lambda_0$, then for $\theta > 0$,

$$E\exp\left(-\theta t^{1+p} e^{-(\lambda_0+b)t} Z_1(t)\right) \to \exp\left(-V_0 u_1 \theta^{\lambda_0/(\lambda_0+b)} c_1(\lambda_0,b)\right), \qquad (1.1)$$

where $c_1(\lambda_0,b)$ is an explicit constant whose value will be given in (3.8). In particular, we
have

$$t^{1+p} e^{-(\lambda_0+b)t} Z_1(t) \Rightarrow V_1$$

where $V_1$ has Laplace transform given by the righthand side of (1.1).

Theorem 2 is similar to Theorem 3 in Durrett and Moseley (2009) which assumes a deterministic
fitness distribution so that all type-1 cells have growth rate $\lambda_1 = \lambda_0 + b$. There,
the asymptotic growth rate of the first generation is $\exp(\lambda_1 t)$. In contrast, the continuous
fitness distribution we consider here has the effect of slowing down the growth rate of the
first generation by the polynomial factor $t^{1+p}$. To explain this difference, we note that the
calculation of the mean given in Section 3 shows that the dominant contribution to $Z_1(t)$
comes from growth rates $x = b - O(1/t)$. However, mutations with this growth rate are unlikely
until the number of type-0 cells is $O(t)$, i.e., roughly at time $r_1 = (1/\lambda_0)\log t$. Thus at
time $t$, the number of type-1 cells will be roughly $\exp((\lambda_0 + b)(t - r_1)) = \exp((\lambda_0 + b)t)/t^{1+p}$.

To prove Theorem 2 we look at mutations as a point process in $[0,t] \times [0,b]$: there is
a point at $(s,x)$ if there was a mutant with birth rate $a_0 + x$ at time $s$. This allows us to
derive the following explicit expression for the Laplace transform of $Z_1(t)$:

$$E\left(e^{-\theta Z_1(t)}\right) = \exp\left(-u_1 \int_0^b dx\, g(x) \int_0^t ds\, V_0 e^{\lambda_0 s}\left(1 - \phi_{x,t-s}(\theta)\right)\right)$$

where $\phi_{x,r}(\theta) = E e^{-\theta Z_r^x}$ and $Z_r^x$ is a continuous-time branching process with birth rate $a_0 + x$,
death rate $b_0$, and initial population $Z_0^x = 1$. In Figure 1 we compare the exact Laplace
transform of $t^{1+p}\exp(-(\lambda_0+b)t)Z_1(t)$ with the results of simulations and the limiting Laplace
transform from Theorem 2, illustrating the convergence as $t \to \infty$.

Notice that the Laplace transform of $V_1$ has the form $\exp(-C\theta^{\alpha})$ where $\alpha = \lambda_0/(\lambda_0 + b)$,
which implies that $P(V_1 > v) \sim v^{-\alpha}$ as $v \to \infty$ (see, for example, the argument in
Section 3 of Durrett and Moseley (2009)). To gain some insight into how this limit comes
about, we give a second proof of the convergence that tells us the limit is the sum of points
in a nonhomogeneous Poisson process. Each point in the limiting process represents the
contribution of a different mutant lineage to $Z_1(t)$.

Theorem 3. $V_1 = \lim_{t\to\infty} t^{1+p} e^{-(\lambda_0+b)t} Z_1(t)$ is the sum of the points of a Poisson process
on $(0,\infty)$ with mean measure $\mu(z,\infty) = A_1(\lambda_0,b)\, u_1 V_0\, z^{-\lambda_0/(\lambda_0+b)}$.

A similar result can be obtained for deterministic fitness distributions, see the Corollary to 
Theorem 3 in Durrett and Moseley (2009). However, the new result shows that the point 
process limit is not an artifact of assuming that all first generation mutants have the same 
growth rate. Even when the fitness advances are random, different mutant lines contribute 
to the limit. This result is consistent with observations of Maley et al. (2006) and Shah et 
al. (2009) that tumors contain cells with different mutational haplotypes. Theorem 3 also
gives quantitative predictions about the relative contribution of different mutations to the 
total population. These implications will be explored further in a follow-up paper currently 
in progress. 

With the behavior of the first generation analyzed, we are ready to proceed to the study 
of further generations. The computation of the mean is straightforward. 

Theorem 4. If (*) holds, then

$$EZ_k(t) \sim \frac{V_0 u_1 \cdots u_k\, g(b)^k}{b^k t^k k!}\, e^{(\lambda_0 + kb)t}.$$

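As in the $k = 1$ case, the mean involves a polynomial correction to the exponential growth
and again, does not give the correct growth rate for the number of type-$k$ cells. To state the
correct limit theorem describing the growth rate of $Z_k(t)$, we will define $p_k$ and $u_{1,k}$ by

$$k + p_k = \sum_{j=0}^{k-1} \frac{\lambda_0 + kb}{\lambda_0 + jb} \quad\text{and}\quad u_{1,k} = \prod_{j=1}^{k} u_j^{\lambda_0/(\lambda_0 + (j-1)b)}$$

for all $k \ge 1$.

Theorem 5. If (*) holds, then for $\theta > 0$,

$$E\exp\left(-\theta t^{k+p_k} e^{-(\lambda_0+kb)t} Z_k(t)\right) \to \exp\left(-c_k(\lambda_0,b)\, V_0 u_{1,k}\, \theta^{\lambda_0/(\lambda_0+kb)}\right).$$

In particular,

$$t^{k+p_k} e^{-(\lambda_0+kb)t} Z_k(t) \Rightarrow V_k.$$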



We prove this result by looking at the mutations to type-1 individuals as a three dimensional
Poisson point process: there is a point at $(s, x, v)$ if there was a type-1 mutant
with birth rate $a_0 + x$ at time $s$ and the number of its type-1 descendants at time $t$, $Z_1^{s,x}(t)$,
has $e^{-(\lambda_0+x)(t-s)} Z_1^{s,x}(t) \to v$ with $v > 0$. To study $Z_k(t)$ we will let $Z_k^{s,x,v}(t)$ be the type-$k$
descendants at time $t$ of the type-1 mutant at $(s,x,v)$. $Z_k^{s,x,v}$ is the same as a process in which
the initial type (here type-1 cells) behaves like $v e^{(\lambda_0+x)(t-s)}$ instead of $Z_0(t) = V_0 e^{\lambda_0 t}$, so the
result can be proved by induction.

To explain the form of the result we consider the case $k = 2$. Breaking things down
according to the times and the sizes of the mutational changes, we have

$$EZ_2(t) = \int_0^b dx_1\, g(x_1) \int_0^b dx_2\, g(x_2) \int_0^t ds_1 \int_{s_1}^t ds_2\; V_0 e^{\lambda_0 s_1}\, u_1 e^{(\lambda_0+x_1)(s_2-s_1)}\, u_2 e^{(\lambda_0+x_1+x_2)(t-s_2)}$$

As in the result for $Z_1(t)$ the dominant contribution comes from $x_1, x_2 = b - O(1/t)$ and as
in the discussion preceding the statement of Theorem 2, the time of the first mutation to
$b - O(1/t)$ is $\approx r_1 = (\log t)/\lambda_0$. The descendants of this mutation grow at exponential rate
$\lambda_0 + b - O(1/t)$, so the time of the first mutation to $2b - O(1/t)$ is $\approx r_2 = r_1 + (\log t)/(\lambda_0 + b)$.
Noticing that

$$\exp((\lambda_0 + 2b)(t - r_2)) = \exp((\lambda_0 + 2b)t)\, t^{-(\lambda_0+2b)/\lambda_0 - (\lambda_0+2b)/(\lambda_0+b)}$$

tells us what to guess for the polynomial term: $t^{-(2+p_2)}$ where

$$2 + p_2 = \frac{\lambda_0 + 2b}{\lambda_0} + \frac{\lambda_0 + 2b}{\lambda_0 + b}.$$

In Figure 2 we compare the asymptotic Laplace transform from Theorem 5 with the
results of simulations in the case $k = 2$. To explain the slow convergence to the limit, we
note that if we take account of the mutation rates $u_1, u_2$ in the heuristic from the previous
paragraph (which becomes important when $u_1, u_2$ are small), then the first time we see a
type-1 cell with growth rate $b - O(1/t)$ will not occur until time $\lambda_0^{-1}\log(t/u_1)$ when the
type-0 cells reach $O(t/u_1)$ and so the first type-2 cell with growth rate $2b - O(1/t)$ will not
be born until time $r = \lambda_0^{-1}\log(t/u_1) + (\lambda_0 + b)^{-1}\log(t/u_2)$ when the descendants of the
type-1 cells with growth rate $b - O(1/t)$ reach size $O(t/u_2)$. When $u_1 = u_2 = 10^{-3}$, $\lambda_0 = .1$,
and $b = .01$, $r \approx 223$. The mutations created at this point will need some time to grow and
become dominant in the population. It would be interesting to compare simulations at time
300, but we have not been able to do this due to the large number of different growth rates
in generation 1.
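The heuristic time $r$ depends on the observation time $t$, which the text does not state explicitly. The sketch below evaluates $r = \lambda_0^{-1}\log(t/u_1) + (\lambda_0+b)^{-1}\log(t/u_2)$ with the quoted parameters; the value $t = 120$ is purely an assumption on our part, chosen because it reproduces $r \approx 223$, and the function name is hypothetical.

```python
import math


def first_fast_type2_time(t, u1, u2, lam0, b):
    """Heuristic birth time r of the first type-2 cell with growth rate 2b - O(1/t)."""
    return math.log(t / u1) / lam0 + math.log(t / u2) / (lam0 + b)


# Parameters quoted in the text; t = 120 is an assumed observation time.
r = first_fast_type2_time(t=120.0, u1=1e-3, u2=1e-3, lam0=0.1, b=0.01)
print(round(r, 1))
```

With these inputs the formula gives about 223.3. Since $r$ exceeds the assumed $t$, the fastest type-2 lineages would not yet have appeared at that time, which is consistent with the slow convergence described above.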

1.2 Unbounded distributions 

Let us now consider situations in which the fitness distribution is unbounded. Suppose that
the fitness increase follows a generalized Frechet distribution,

$$P(X > x) = x^{\beta} e^{-\gamma x^{\alpha}} \qquad (1.2)$$

for some positive $\gamma$, $\alpha$ and any $\beta \in \mathbb{R}$. There is a two-fold purpose for considering such
distributions. First, if i.i.d. random variables $\zeta_1, \ldots, \zeta_n$ have a power law tail, i.e.
$P(\zeta_i > y) \sim c y^{-\alpha}$ as $y \to \infty$, then their maxima and the spacings between order statistics converge to
a limit of the form (1.2) with $\beta = 0$. Second, this choice allows us to consider the
gamma$(\beta + 1, \gamma)$ distribution which has $\alpha = 1$ and the normal distribution, which asymptotically has
this form with $\alpha = 2$, $\beta = -1$.
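The statement about the gamma distribution can be checked numerically. The sketch below assumes an integer shape $\beta + 1$ (the Erlang case, so that the survival function has a closed form) and compares the tail with $x^{\beta}e^{-\gamma x}$. The match holds only up to a constant factor, $\gamma^{\beta}/\Gamma(\beta+1)$, which the form (1.2) leaves implicit; the function names are hypothetical.

```python
import math


def erlang_sf(x, shape, rate):
    """P(X > x) for a gamma distribution with integer shape and the given rate."""
    partial = sum((rate * x) ** j / math.factorial(j) for j in range(shape))
    return math.exp(-rate * x) * partial


def tail_ratio(x, beta, gamma):
    """Tail of gamma(beta + 1, gamma) divided by x**beta * exp(-gamma * x)."""
    return erlang_sf(x, beta + 1, gamma) / (x ** beta * math.exp(-gamma * x))
```

For example, with $\beta = 2$ and $\gamma = 1$ the ratio tends to $1/2 = \gamma^{\beta}/\Gamma(\beta+1)$ as $x$ grows, so the tail has the form (1.2) with $\alpha = 1$ up to that constant.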

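To analyze this situation, we will again take a Poisson process viewpoint and look at
the contribution from a mutation at time $s$ with increased growth rate $x$. A mutation that
increases the growth rate by $x$ at time $s$ will, if it does not die out, grow to $e^{(\lambda_0+x)(t-s)}\zeta$ at
time $t$ where $\zeta$ has an exponential distribution. The growth rate $(\lambda_0 + x)(t - s) > z$ when

$$x > \frac{z}{t-s} - \lambda_0.$$

Therefore,

$$\mu(z,\infty) = E(\#\text{ mutations with } (\lambda_0 + x)(t-s) > z)$$

$$= V_0 u_1 \int_0^t e^{\lambda_0 s}\left(\frac{z}{t-s} - \lambda_0\right)^{\beta} \exp\left(-\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}\right) ds$$

$$= V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} \exp(S(s,z))\, ds$$

where

$$S(s,z) = \lambda_0 s - \gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}. \qquad (1.3)$$

The size of this integral can be found by maximizing the exponent $S$ over $s$ for fixed $z$. Since

$$\frac{\partial S}{\partial s}(s,z) = \lambda_0 - \alpha\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha-1}\frac{z}{(t-s)^2} \qquad (1.4)$$

and

$$\frac{\partial^2 S}{\partial s^2}(s,z) = -\alpha(\alpha-1)\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha-2}\frac{z^2}{(t-s)^4} - \alpha\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha-1}\frac{2z}{(t-s)^3} \qquad (1.5)$$

we can see that $\partial^2 S/\partial s^2(s, z) < 0$ when $z > \lambda_0(t - s)$ so that for all $z$ in this range, $S(s, z)$
is concave as a function of $s$ and achieves its maximum at a unique value $s_z$.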

When $\alpha = 1$, it is easy to set (1.4) to 0 and solve for $s_z$. This in turn leads to an
asymptotic formula for $\mu(z, \infty)$ and allows us to derive the following limit theorem for $Z_1(t)$.
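For $\alpha = 1$ the maximization can be carried out by hand: setting (1.4) to zero gives $t - s_z = \sqrt{\gamma z/\lambda_0}$, and substituting back yields $S(s_z, z) = \lambda_0 t + \gamma\lambda_0 - 2\sqrt{\gamma\lambda_0 z}$. The sketch below, with hypothetical function names and parameter values chosen only for illustration, compares this closed form with a brute-force grid search.

```python
import math


def S(s, z, t, lam0, gamma):
    """Exponent S(s, z) from (1.3) in the case alpha = 1."""
    return lam0 * s - gamma * (z / (t - s) - lam0)


def s_closed_form(z, t, lam0, gamma):
    """Root of (1.4) when alpha = 1: t - s_z = sqrt(gamma * z / lam0)."""
    return t - math.sqrt(gamma * z / lam0)


def s_grid_argmax(z, t, lam0, gamma, n=100_000):
    """Brute-force maximiser of S over an evenly spaced grid of s in [0, t)."""
    grid = (t * j / n for j in range(n))
    return max(grid, key=lambda s: S(s, z, t, lam0, gamma))
```

Taking $z = c\,t^2$ with $c = \lambda_0/4\gamma$ makes the maximal exponent equal to $\gamma\lambda_0$, which is of order one; this is the heuristic reason why $\log Z_1(t)$ grows like $(\lambda_0/4\gamma)\,t^2$ in this case.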

Theorem 6. Suppose $\alpha = 1$ and let $c_0 = \lambda_0/4\gamma$. Then $t^{-2}\log Z_1(t) \to c_0$ and

1 



t 



1 7 , f 1 l (20 + 1) log* 

logZi(t) - cot 1 + - 



y 



A t 

where $y^*$ is the rightmost point in the point process with intensity given by

$$(2c_0)^{\beta}(\pi/\lambda_0)^{1/2}\, V_0 u_1 \exp(\gamma\lambda_0 - \lambda_0 y/2c_0). \qquad (1.6)$$

When $\alpha \ne 1$, solving for $s_z$ becomes more difficult, but we are still able to prove the
following limit theorem for $Z_1(t)$.

Theorem 7. Suppose $\alpha > 1$ is an integer. There exist explicitly calculable constants
$c_k = c_k(\alpha, \gamma)$, $0 \le k < \alpha$, and $\kappa = \kappa(\beta, \alpha, \gamma)$ so that $t^{-(\alpha+1)/\alpha}\log Z_1(t) \to c_0$ and



t l ' a 



logZ 1 (t)-c ^ +1 )/ a U+Y, Ckt ~ 

\ l<k<a 



where y* is the rightmost particle in a point process with explicitly calculable intensity. 

The complicated form of the result is due to the fact that the fluctuations are only of order
$t^{1/\alpha}$, so we have to be very precise in locating the maximum. The explicit formulas for the
constants and the intensity of the point process are given in (5.12) and (5.13). With more
work this result could be proved for a general $\alpha > 1$, but we have not tried to do this or
prove Conjecture 1 below because the super-exponential growth rates in the unbounded case
are too fast to be realistic.

We conclude this section with two comments. First, the proof of Theorem 7 shows that in
contrast to the bounded case, in the unbounded case, most type-1 individuals are descendants
of a single mutant. Second, the proof shows that the mutant with the
largest growth rate is born at time $s \sim t/(\alpha+1)$ (see Remark 1 at the end of Section 5) and
has growth rate $z = O(t^{(\alpha+1)/\alpha})$. The intuition behind this is that since the type-0 cells have
growth rate $e^{\lambda_0 s}$ and the distribution of the increase in fitness has tail $\sim e^{-\gamma x^{\alpha}}$, the largest
advance $x$ attained by time $t$ should occur when $s = O(t)$ and satisfy

$$e^{\lambda_0 t} e^{-\gamma x^{\alpha}} = O(1) \quad\text{or}\quad x = O(t^{1/\alpha}).$$

The growth rate of its family is then $(\lambda_0 + x)(t - s) = O(t^{(\alpha+1)/\alpha})$.

Since the type-1 cells grow at exponential rate $c_1 t^{(\alpha+1)/\alpha}$, if we apply this same reasoning
to type-2 mutants, then the largest additional fitness advance $x$ attained by type-2 individuals
should satisfy

$$e^{c_1 t^{(\alpha+1)/\alpha}} e^{-\gamma x^{\alpha}} = O(1) \quad\text{or}\quad x = O(t^{1/\alpha + 1/\alpha^2})$$

and the growth rate of its family will be $O(t^{1 + 1/\alpha + 1/\alpha^2})$. Extrapolating from the first two
generations, we make the following

Conjecture 1. Let $q(k) = \sum_{j=0}^{k} \alpha^{-j}$. As $t \to \infty$,

$$\frac{1}{t^{q(k)}} \log Z_k(t) \to c_k.$$

Note that in the case of the exponential distribution, $q(k) = k + 1$.
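The conjectured exponent is easy to tabulate. The short sketch below (the function name is ours) evaluates $q(k) = \sum_{j=0}^{k}\alpha^{-j}$ and can be used to confirm the two special cases mentioned in the text: $q(1) = (\alpha+1)/\alpha$ for the first generation and $q(k) = k+1$ when $\alpha = 1$.

```python
def q(k, alpha):
    """Conjectured growth exponent q(k) = sum_{j=0}^{k} alpha**(-j)."""
    return sum(alpha ** (-j) for j in range(k + 1))
```

For $\alpha > 1$ the exponents stay below the geometric-series limit $\alpha/(\alpha-1)$, so under the conjecture successive generations add progressively smaller powers of $t$.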

The rest of the paper is organized as follows. Sections 2-5 are devoted to proofs of our
main results. After some preliminary notation and definitions in Section 2, Theorems 1-3
are proved in Section 3, Theorems 4-5 in Section 4, and Theorems 6-7 in Section 5. We
conclude with a discussion of our results in Section 6.

2 Preliminaries

This section contains some preliminary notation and definitions which we will need for the
proofs of our main results. We denote by $\mathcal{N}(t)$ the points in a two dimensional Poisson
process on $[0,t] \times [0, \infty)$ with mean measure

$$V_0 e^{\lambda_0 s}\, ds\, \nu(dx)$$

where in Sections 3-4, $\nu(dx) = g(x)\,dx$ with $g$ satisfying (*) and in Section 5, $\nu$ has tail
$\nu(x, \infty) = x^{\beta} e^{-\gamma x^{\alpha}}$. In other words, we have a point at $(s,x)$ if there was a mutant with
birth rate $a_0 + x$ at time $s$. Define a collection of independent birth/death branching processes
$Z_1^{s,x}(t)$ indexed by $(s,x) \in \mathcal{N}(t)$ with $Z_1^{s,x}(s) = 1$, individual birth rate $a_0 + x$, and death
rate $b_0$. $Z_1^{s,x}(t)$ is the contribution of the mutation at $(s,x)$ and

$$Z_1(t) = \sum_{(s,x)\in\mathcal{N}(t)} Z_1^{s,x}(t).$$

It is well known that

$$e^{-(\lambda_0+x)(t-s)} Z_1^{s,x}(t) \to \frac{b_0}{a_0+x}\,\delta_0 + \frac{\lambda_0+x}{a_0+x}\,\zeta$$

where $\zeta \sim \exp((\lambda_0 + x)/(a_0 + x))$ (see, for example, equation (1) in Durrett and Moseley
(2009)). In several results, we shall make use of the three dimensional Poisson process $\mathcal{M}(t)$
on $[0,t] \times [0, \infty) \times (0, \infty)$ with intensity

$$V_0 e^{\lambda_0 s}\, ds\, \nu(dx) \left(\frac{\lambda_0+x}{a_0+x}\right)^2 e^{-v(\lambda_0+x)/(a_0+x)}\, dv.$$

In words, $(s,x,v) \in \mathcal{M}(t)$ if there was a mutant with birth rate $a_0 + x$ at time $s$ and the
number of its descendants at time $t$, $Z_1^{s,x}(t)$, has $Z_1^{s,x}(t) \sim v e^{(\lambda_0+x)(t-s)}$. It is also convenient
to define the mapping $z : [0,t] \times [0, \infty) \to [0, \infty)$ which maps a point $(s,x) \in \mathcal{N}(t)$ to the
growth rate of the induced branching process if it survives: $z(s, x) = (\lambda_0 + x)(t - s)$ and let

$$\mu(A) = E\,|\{(s,x) \in \mathcal{N}(t) : z(s,x) \in A\}|$$

for $A \subset [0, \infty)$.
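The weight $(\lambda_0+x)/(a_0+x)$ on the exponential part of the limit above is the probability that a single mutant lineage survives. The sketch below estimates it by simulating the embedded jump chain of a birth/death process started from one cell. The population cap used to declare survival, the number of repetitions, and the function names are simplifying assumptions made only for illustration.

```python
import random


def survives(birth, death, rng, cap=60):
    """Follow the jump chain of a birth/death process started from one cell.

    Returns True once the population reaches `cap` (treated as survival)
    and False if it hits zero first.
    """
    n = 1
    p_birth = birth / (birth + death)  # chance that the next event is a birth
    while 0 < n < cap:
        n += 1 if rng.random() < p_birth else -1
    return n >= cap


def survival_fraction(birth, death, reps=4000, seed=7):
    """Monte Carlo estimate of the survival probability of one lineage."""
    rng = random.Random(seed)
    return sum(survives(birth, death, rng) for _ in range(reps)) / reps
```

With birth rate $a_0 + x = 1$ and death rate $b_0 = 0.5$, the estimate should be close to $(\lambda_0 + x)/(a_0 + x) = 0.5$.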

We shall use $C$ to denote a generic constant whose value may change from line to line.
We write $f(t) \sim g(t)$ if $f(t)/g(t) \to 1$ as $t \to \infty$ and $f(t) = o(g(t))$ if $f(t)/g(t) \to 0$.
$f(t) \gg (\ll)\, g(t)$ means that $f(t)/g(t) \to \infty$ (resp. 0) as $t \to \infty$ and $f(t) = O(g(t))$ means
$|f(t)| \le C g(t)$ for all $t > 0$. We also shall use the notation $f(t) \approx g(t)$ if $f(t) = g(t) + o(1)$
as $t \to \infty$.

3 Bounded distributions, $Z_1$

In this section, we prove Theorems 1-3.

Proof of Theorem 1. Mutations to 1's occur at rate $u_1 V_0 e^{\lambda_0 s}$ so

$$EZ_1(t) = u_1 \int_0^t \int_0^b e^{(t-s)(\lambda_0+x)} g(x)\, dx\, V_0 e^{\lambda_0 s}\, ds$$

$$= u_1 V_0 e^{\lambda_0 t} \int_0^b dx\, g(x) \int_0^t e^{(t-s)x}\, ds \qquad (3.1)$$

$$= u_1 V_0 e^{\lambda_0 t} \int_0^b dx\, g(x)\, \frac{e^{tx} - 1}{x}.$$

We begin by showing that the contribution from $x \in [0, b - (1 + k)(\log t)/t]$ can be ignored
for any $k \in [0, \infty)$. The Mean Value theorem implies that

$$\frac{e^{tx} - 1}{x} \le t e^{tx} \qquad (3.2)$$

Using this and the fact that $\int_c^d t e^{tx}\, dx \le e^{td}$ for any $c < d$, we can see that

$$t^k e^{-bt} \int_0^{b-(1+k)(\log t)/t} dx\, g(x)\, \frac{e^{tx} - 1}{x} \le G t^k e^{-(1+k)\log t} \to 0. \qquad (3.3)$$

To handle the other piece of the integral, we take $k = 1$ and note that

$$\int_{b-(2\log t)/t}^b dx\, g(x)\, \frac{e^{tx} - 1}{x} \sim \frac{g(b)}{b}\, e^{bt} \int_{b-(2\log t)/t}^b e^{-(b-x)t}\, dx.$$

After changing variables $y = (b - x)t$, $dx = -dy/t$, the last integral

$$= \frac{1}{t}\int_0^{2\log t} e^{-y}\, dy \sim 1/t$$

which proves the result. □
The above proof tells us that the dominant contribution to the 1's comes from mutations
with fitness increase $x > b_t = b - (2\log t)/t$. To describe the times at which the dominant
contributions occur, let $S(t) = (2/b)\log\log t$. Then the contribution to the mean from
$x \in [b_t, b]$ and $s \ge S(t)$ is by (3.1)

$$\le G u_1 V_0 e^{(\lambda_0+b)t}\, \frac{2\log t}{t} \int_{S(t)}^{\infty} e^{-s b_t}\, ds = G u_1 V_0 e^{(\lambda_0+b)t}\, \frac{2\log t}{t\, b_t}\, e^{-b_t S(t)}.$$

Since $b_t S(t) \approx 2\log\log t$, this quantity is $o(t^{-1}e^{(\lambda_0+b)t})$. In words, the dominant contribution
to the mean comes from points close to $(0, b)$ or more precisely from
$[0, (2/b)\log\log t] \times [b - (2\log t)/t,\, b]$.

Proof of Theorem 2. It suffices to prove (1.1). The computation in (3.3) with $k = 2 + p$
implies that the contribution from mutations with $x \le b_t = b - (3 + p)(\log t)/t$ can be
ignored. Therefore, we have

$$E\exp(-\theta Z_1(t)) \sim E\left(\exp(-\theta Z_1(t)); A_t\right)$$

where $A_t = \{(s,x) \in \mathcal{N}(t) : x > b_t\}$. By Lemma 2 of Durrett and Moseley (2009), we have

$$E\left(e^{-\theta Z_1(t)}; A_t\right) = \exp\left(-u_1 \int_{b_t}^b dx\, g(x) \int_0^t ds\, V_0 e^{\lambda_0 s}\left(1 - \phi_{x,t-s}(\theta)\right)\right)$$

where $\phi_{x,r}(\theta) = E e^{-\theta Z_r^x}$ and $Z_r^x$ is a birth/death branching process with birth rate $a_0 + x$,
death rate $b_0$, and initial population $Z_0^x = 1$. Using

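$$e^{-(\lambda_0+b)t} = e^{-(\lambda_0+x)(t-s)}\, e^{-(\lambda_0+x)s}\, e^{-(b-x)t} \qquad (3.4)$$

we have

$$E\left(\exp(-\theta Z_1(t) e^{-(\lambda_0+b)t} t^{1+p}); A_t\right) = \exp\Big(-u_1 V_0 \int_{b_t}^b dx\, g(x) \int_0^t ds\, e^{\lambda_0 s} \left\{1 - \phi_{x,t-s}\left(\theta e^{-(\lambda_0+x)(t-s)} e^{-(\lambda_0+x)s} e^{-(b-x)t} t^{1+p}\right)\right\}\Big)$$

Changing variables $s = r_x + r$ where $r_x = (\lambda_0+x)^{-1}\log(t^{1+p})$ on the inside integral, $y = (b - x)t$,
$dy/t = -dx$ on the outside, and continuing to write $x$ as short hand for $b - y/t$, the above

$$= \exp\Big(-u_1 V_0 \int_0^{(3+p)\log t} \frac{dy}{t}\, g(x)\, t^{(1+p)\lambda_0/(\lambda_0+x)} \int_{-r_x}^{t-r_x} dr\, e^{\lambda_0 r} \left\{1 - \phi_{x,t-r-r_x}\left(\theta e^{-(\lambda_0+x)(t-r-r_x)} e^{-(\lambda_0+x)r} e^{-y}\right)\right\}\Big) \qquad (3.5)$$

Formula (20) in Durrett and Moseley (2009) implies that as $u \to \infty$,

$$1 - \phi_{x,u}\left(\theta e^{-(\lambda_0+x)u}\right) \to \frac{\lambda_0+x}{a_0+x} \cdot \frac{\theta}{\theta + \frac{\lambda_0+x}{a_0+x}} \qquad (3.6)$$

and therefore, letting $t \to \infty$ and using $(1+p)\lambda_0/(\lambda_0 + b) = 1$, we can see that the expression
in (3.5)

$$\to \exp\Big(-u_1 V_0\, g(b) \int_0^{\infty} dy\, \frac{\lambda_0+b}{a_0+b} \int_{-\infty}^{\infty} dr\, e^{\lambda_0 r}\, \frac{\theta e^{-(\lambda_0+b)r} e^{-y}}{\theta e^{-(\lambda_0+b)r} e^{-y} + \frac{\lambda_0+b}{a_0+b}}\Big)$$

Changing variables $r = \frac{1}{\lambda_0+b}\{q + \log[\theta e^{-y}(a_0 + b)/(\lambda_0 + b)]\}$, $dr = dq/(\lambda_0 + b)$ gives

$$= \exp\Big(-u_1 V_0\, g(b)\, \theta^{\lambda_0/(\lambda_0+b)} \left(\frac{\lambda_0+b}{a_0+b}\right)^{b/(\lambda_0+b)} \int_0^{\infty} dy\, e^{-y\lambda_0/(\lambda_0+b)} \int_{-\infty}^{\infty} \frac{dq}{\lambda_0+b}\, e^{q\lambda_0/(\lambda_0+b)}\, \frac{e^{-q}}{e^{-q}+1}\Big)$$

To simplify the first integral we note that

$$\int_0^{\infty} dy\, e^{-y\lambda_0/(\lambda_0+b)} = \frac{\lambda_0+b}{\lambda_0}.$$

For the second integral, we prove

Lemma 1. If $0 < c < 1$,

$$\int_{-\infty}^{\infty} dq\, e^{qc}\, \frac{e^{-q}}{e^{-q}+1} = \Gamma(c)\Gamma(1-c). \qquad (3.7)$$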

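Proof. We can rewrite the integral as

$$\int_{-\infty}^{\infty} dq\, e^{qc} \int_0^{\infty} dx\, e^{-x} e^{-q}\exp(-e^{-q}x)$$

so that after interchanging the order of integration and changing variables $w = e^{-q}x$,
$dw = -dq\, e^{-q}x$ so that $w/x = e^{-q}$, $dw/x = -dq\, e^{-q}$, we have

$$\int_0^{\infty} dx \int_0^{\infty} \frac{dw}{x}\,(w/x)^{-c} e^{-x} e^{-w} = \int_0^{\infty} dx\, x^{-1+c} e^{-x} \int_0^{\infty} dw\, w^{-c} e^{-w}$$

which is $= \Gamma(c)\Gamma(1 - c)$. □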
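The identity in Lemma 1 can also be checked numerically. The sketch below, which is only a sanity check and not part of the proof, rewrites the integrand $e^{qc}e^{-q}/(e^{-q}+1)$ as $e^{qc}/(1+e^{q})$ and applies the midpoint rule on a truncated interval. The truncation width and step count are our choices; the tails decay exponentially, so the truncation error is negligible for $c$ away from 0 and 1.

```python
import math


def lemma1_integral(c, half_width=100.0, n=200_000):
    """Midpoint-rule value of the integral of e^{qc} / (1 + e^{q}) over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        q = -half_width + (j + 0.5) * h
        total += math.exp(q * c) / (1.0 + math.exp(q))
    return total * h


def gamma_product(c):
    """Right-hand side of (3.7)."""
    return math.gamma(c) * math.gamma(1.0 - c)
```

For instance, at $c = 0.3$ both sides are approximately 3.883, which also agrees with the reflection formula $\Gamma(c)\Gamma(1-c) = \pi/\sin(\pi c)$.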
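Taking $c = \lambda_0/(\lambda_0 + b)$ and letting

$$c_1(\lambda_0, b) = g(b)\,\frac{\lambda_0+b}{\lambda_0} \cdot \frac{1}{\lambda_0+b}\left(\frac{a_0+b}{\lambda_0+b}\right)^{-b/(\lambda_0+b)} \Gamma(\lambda_0/(\lambda_0+b))\,\Gamma(1 - \lambda_0/(\lambda_0+b)) \qquad (3.8)$$

we have proved Theorem 2. □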

Recall that we have assumed $Z_0(t) = V_0 e^{\lambda_0 t}$ is deterministic. This assumption can be
relaxed to obtain the following generalization of Theorem 2 which is used in Section 4.

Lemma 2. Suppose that $Z_0(t)$ is a stochastic process with $Z_0(t) \sim e^{\lambda_0 t} V_0$ for some constant
$V_0$ as $t \to \infty$. Then the conclusions of Theorem 2 remain valid.

To see why this is true, we can use a variant of Lemma 2 from Durrett and Moseley
(2009) to conclude that

$$E\left(e^{-\theta Z_1(t)} \,\middle|\, \mathcal{F}_t^0\right) = \exp\left(-u_1 \int_0^b dx\, g(x) \int_0^t ds\, Z_0(s)\left(1 - \phi_{x,t-s}(\theta)\right)\right)$$

where $\mathcal{F}_t^0$ is the $\sigma$-field generated by $Z_0(s)$ for $s \le t$. Therefore,

$$E\left(e^{-\theta Z_1(t)}\right) = E\exp\left(-u_1 \int_0^b dx\, g(x) \int_0^t ds\, Z_0(s)\left(1 - \phi_{x,t-s}(\theta)\right)\right)$$

Given $\epsilon > 0$, we can choose $t_\epsilon > 0$ so that

$$\left|\frac{Z_0(t)}{V_0 \exp(\lambda_0 t)} - 1\right| < \epsilon$$

for all $t > t_\epsilon$. Since the contribution from $t \le t_\epsilon$ will not affect the limit and the term inside
the expectation is bounded, the rest of the proof can be completed in the same manner as
the proof of Theorem 2.

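We conclude this section with the

Proof of Theorem 3. Let $\mathcal{M}(t)$ be the three dimensional Poisson process defined in
Section 2. Using (3.4), we see that in order for the contribution of $Z_1^{s,x}(t)$ to the limit of
$t^{1+p}e^{-(\lambda_0+b)t}Z_1(t)$ to be $> z$ we need

$$v > z\, t^{-(1+p)} e^{(b-x)t} e^{(\lambda_0+x)s}.$$

Therefore, the expected number of mutations that contribute more than $z$ to the limit is

$$u_1 V_0 \int_0^b dx\, g(x) \int_0^t ds\, e^{\lambda_0 s}\, \frac{\lambda_0+x}{a_0+x} \exp\left(-\frac{\lambda_0+x}{a_0+x}\, z\, t^{-(1+p)} e^{(b-x)t} e^{(\lambda_0+x)s}\right).$$

In order to turn the big exponential into $e^{-r}$ we change variables:

$$s = \frac{1}{\lambda_0+x}\log\left(\frac{r}{z\, t^{-(1+p)} e^{(b-x)t}\, \frac{\lambda_0+x}{a_0+x}}\right),$$

$ds = dr/(r(\lambda_0 + x))$, to get

$$u_1 V_0 \int_0^b dx\, g(x)\, z^{-\lambda_0/(\lambda_0+x)} \left(\frac{\lambda_0+x}{a_0+x}\right)^{x/(\lambda_0+x)} t^{(1+p)\lambda_0/(\lambda_0+x)}\, e^{-(b-x)t\lambda_0/(\lambda_0+x)} \int_{\alpha(x,t)}^{\beta(x,t)} \frac{dr}{\lambda_0+x}\, r^{-x/(\lambda_0+x)} e^{-r}$$

where $\alpha(x,t) = z t^{-(1+p)} e^{(b-x)t}(\lambda_0 + x)/(a_0 + x)$ and $\beta(x,t) = \alpha(x,t)e^{(\lambda_0+x)t}$. As in the
previous proof, the main contribution comes from $x \in [b_t, b]$ so when we change variables
$y = (b - x)t$, $dx = -dy/t$, replace the $x$'s by $b$'s and use $1 = (1+p)\lambda_0/(\lambda_0 + b)$ we convert
the above into

$$g(b)\, z^{-\lambda_0/(\lambda_0+b)}\, \frac{u_1 V_0}{\lambda_0+b} \left(\frac{\lambda_0+b}{a_0+b}\right)^{b/(\lambda_0+b)} \int_0^{\infty} dy\, e^{-y\lambda_0/(\lambda_0+b)} \int_0^{\infty} dr\, r^{-b/(\lambda_0+b)} e^{-r}.$$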

Performing the integrals gives the result with

\[
A_1(\lambda_0, b) = g(b)\, \frac{1}{\lambda_0} \left(\frac{\lambda_0+b}{a_0+b}\right)^{b/(\lambda_0+b)} \Gamma\bigl(\lambda_0/(\lambda_0+b)\bigr). \qquad \square
\]
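As a numerical cross-check of this last step, the sketch below evaluates the two integrals of the preceding display and compares the result with the closed form for $A_1(\lambda_0, b)$. It is only a sanity check, not part of the proof. The function names are made up for illustration, the helper `mean_measure_tail` simply multiplies $A_1$ by $u_1 V_0 z^{-\lambda_0/(\lambda_0+b)}$ as in the display, and the parameter values suggested in the usage note are the ones quoted in the figure captions, under the assumption that $\lambda_0 = a_0 - b_0$.

```python
import math


def a1_constant(lam0: float, a0: float, b: float, g_b: float) -> float:
    """Closed form: g(b)/lam0 * ((lam0+b)/(a0+b))**(b/(lam0+b)) * Gamma(lam0/(lam0+b))."""
    expo = b / (lam0 + b)
    return g_b / lam0 * ((lam0 + b) / (a0 + b)) ** expo * math.gamma(lam0 / (lam0 + b))


def a1_from_integrals(lam0: float, a0: float, b: float, g_b: float,
                      n_steps: int = 200_000, cutoff: float = 60.0) -> float:
    """Same constant, with the y-integral done numerically (midpoint rule)."""
    rate = lam0 / (lam0 + b)
    y_max = cutoff / rate                      # exp(-rate*y) is about 1e-26 at the cutoff
    h = y_max / n_steps
    y_integral = h * sum(math.exp(-rate * (i + 0.5) * h) for i in range(n_steps))
    # int_0^inf r^{-b/(lam0+b)} e^{-r} dr is a Gamma function
    r_integral = math.gamma(1.0 - b / (lam0 + b))
    prefactor = g_b / (lam0 + b) * ((lam0 + b) / (a0 + b)) ** (b / (lam0 + b))
    return prefactor * y_integral * r_integral


def mean_measure_tail(z: float, lam0: float, a0: float, b: float,
                      g_b: float, u1: float, v0: float) -> float:
    """Expected number of limit points larger than z: A_1 * u1 * V0 * z^{-lam0/(lam0+b)}."""
    return a1_constant(lam0, a0, b, g_b) * u1 * v0 * z ** (-lam0 / (lam0 + b))
```

With $a_0 = 0.2$, $b_0 = 0.1$, $b = 0.01$ and $g$ uniform on $[0, 0.01]$ (so $g(b) = 100$), calling `a1_constant(0.1, 0.2, 0.01, 100.0)` and `a1_from_integrals(0.1, 0.2, 0.01, 100.0)` should give the same number up to quadrature error.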
4 Bounded distributions, $Z_k$



We now move on to the proofs of Theorems 4 and 5. Recall that we have defined $p_k$ by the relation

\[
k + p_k = \sum_{j=0}^{k-1} \frac{\lambda_0 + kb}{\lambda_0 + jb}.
\]

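Proof of Theorem 4. Breaking things down according to the times and the sizes of the mutational changes we have

\[
\begin{aligned}
EZ_k(t) &= \int_0^b dx_1\, g(x_1) \cdots \int_0^b dx_k\, g(x_k) \int_0^t ds_1 \cdots \int_{s_{k-1}}^t ds_k \\
&\qquad\quad V_0 u_1 \cdots u_k\, e^{\lambda_0 s_1} e^{(\lambda_0+x_1)(s_2-s_1)} \cdots e^{(\lambda_0+x_1+\cdots+x_k)(t-s_k)} \\
&= \int_0^b dx_1\, g(x_1) \cdots \int_0^b dx_k\, g(x_k) \int_0^t ds_1 \cdots \int_{s_{k-1}}^t ds_k \\
&\qquad\quad V_0 u_1 \cdots u_k\, e^{\lambda_0 t} e^{x_1(t-s_1)} \cdots e^{x_k(t-s_k)}.
\end{aligned} \tag{4.1}
\]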

The first step is to show

Lemma 3. Let $b_t = b - (2k+1)(\log t)/t$. The contribution to $EZ_k(t)$ from points $(x_1, \ldots, x_k)$ with some $x_i < b_t$ is $o(t^{-2k} e^{(\lambda_0+kb)t})$.

Proof. (3.2) implies that

\[
\int_{s_{j-1}}^t ds_j\, e^{(x_j+\cdots+x_k)(t-s_j)} = \frac{e^{(x_j+\cdots+x_k)(t-s_{j-1})} - 1}{x_j+\cdots+x_k} \le t\, e^{(x_j+\cdots+x_k)(t-s_{j-1})}.
\]

Applying this and working backwards in the above expression for $EZ_k(t)$, we get

\[
EZ_k(t) \le t^k V_0 u_1 \cdots u_k \int_0^b dx_1\, g(x_1) \cdots \int_0^b dx_k\, g(x_k)\, e^{(\lambda_0+x_1+\cdots+x_k)t}
\]

and the desired result follows. □
With the Lemma established, when we work backwards

\[
\int_{s_{j-1}}^t ds_j\, e^{(x_j+\cdots+x_k)(t-s_j)} = \frac{e^{(x_j+\cdots+x_k)(t-s_{j-1})} - 1}{x_j+\cdots+x_k} \sim \frac{e^{(x_j+\cdots+x_k)(t-s_{j-1})}}{(k-j+1)b}.
\]

From this and induction, we see that the contribution from points $(x_1, \ldots, x_k)$ with $x_i \in [b_t, b]$ for all $i$ is

\[
\sim \frac{V_0 u_1 \cdots u_k\, g(b)^k}{b^k k!} \int_{b_t}^b dx_1 \cdots \int_{b_t}^b dx_k\, e^{(\lambda_0+x_1+\cdots+x_k)t}.
\]

Changing variables $y_i = t(b - x_i)$ the above

\[
\sim \frac{V_0 u_1 \cdots u_k\, g(b)^k}{b^k t^k k!}\, e^{(\lambda_0+kb)t}
\]

which proves the desired result. □
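For $k = 1$ the asymptotic formula can be checked numerically. The sketch below assumes that $g$ is uniform on $[0, b]$ (as in the figure captions), so that $g(b) = 1/b$, and computes the ratio of the exact mean $EZ_1(t)$ from (4.1) to $V_0 u_1 g(b) e^{(\lambda_0+b)t}/(bt)$. After the substitution $u = xt$ this ratio equals $T e^{-T} \int_0^T (e^u - 1)/u \, du$ with $T = bt$, so $V_0$, $u_1$ and $\lambda_0$ cancel. The function name and the quadrature settings are illustrative choices, not something taken from the paper.

```python
import math


def mean_ratio_k1(b: float, t: float, n_steps: int = 100_000) -> float:
    """Exact E Z_1(t) divided by V0*u1*g(b)*exp((lam0+b)t)/(b t), for g uniform on [0, b].

    Equals T * exp(-T) * int_0^T (e^u - 1)/u du with T = b*t (midpoint rule).
    """
    big_t = b * t
    h = big_t / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h
        # exp(-T) * (exp(u) - 1), written so that nothing overflows for large T
        total += (math.exp(u - big_t) - math.exp(-big_t)) / u
    return big_t * h * total
```

The ratio approaches 1 only at rate $1/(bt)$, so with $b = 0.01$ one needs $t$ in the thousands before the asymptotic form is accurate to a few percent.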
In the proof of the last result, we showed that the dominant contribution comes from mutations with $x_i > b_t$. To prove our limit theorem we will also need a result regarding the times at which the mutations to the dominant types occur.

Lemma 4. Let a k = ^r^- The contribution to EZ k {t) from points with S\ > a k logt is 

Q ^--2k e (\ +kb)ty 



Proof. Replacing the $x_i$'s in the exponents by $b$'s, we can see from (4.1) that the expected contribution from points with $s_1 > \alpha_k \log t$ is

\[
\begin{aligned}
&\le b^k G^k V_0 u_1 \cdots u_k \int_{\alpha_k \log t}^t ds_1 \int_{s_1}^t ds_2 \cdots \int_{s_{k-1}}^t ds_k\, e^{\lambda_0 t} e^{b(t-s_1)} \cdots e^{b(t-s_k)} \\
&\le C e^{\lambda_0 t} \int_{\alpha_k \log t}^t e^{kb(t-s_1)}\, ds_1
\end{aligned}
\]

and the desired result follows. □
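Recall that

\[
k + p_k = \sum_{j=0}^{k-1} \frac{\lambda_0 + kb}{\lambda_0 + jb}.
\]

For the induction used in the next proof, we will also need the corresponding quantity with $\lambda_0$ replaced by $\lambda_0 + x$ and $k$ by $k - 1$,

\[
k - 1 + p_{k-1}(x) = \sum_{j=0}^{k-2} \frac{\lambda_0 + x + (k-1)b}{\lambda_0 + x + jb},
\]

which means

\[
p_{k-1}(x) = \sum_{j=0}^{k-2} \frac{(k-1-j)b}{\lambda_0 + x + jb}.
\]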

The limit will depend on the mutation rates through

\[
u_{1,k} = \prod_{j=1}^{k} u_j^{\lambda_0/(\lambda_0+(j-1)b)}.
\]

Again we will need the corresponding quantity with $k - 1$ terms

\[
u_{2,k}(x) = \prod_{j=1}^{k-1} u_{j+1}^{(\lambda_0+x)/(\lambda_0+x+(j-1)b)}.
\]

We shall write $u_{2,k} = u_{2,k}(b)$ and note that

\[
u_{1,k} = u_1\, u_{2,k}^{\lambda_0/(\lambda_0+b)}. \tag{4.2}
\]
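The bookkeeping quantities $p_k$, $u_{1,k}$ and $u_{2,k}(x)$ are easy to get wrong by one index, so a short sketch that implements the three definitions and checks the two identities used in the induction may be helpful: $(k + p_k) - (k - 1 + p_{k-1}(b)) = (\lambda_0 + kb)/\lambda_0$ and (4.2). The function names and the sample mutation rates in the usage note are illustrative assumptions.

```python
import math


def k_plus_p(k: int, lam0: float, b: float, x: float = 0.0) -> float:
    """k + p_k with lam0 replaced by lam0 + x: sum over j < k of (base + k*b)/(base + j*b)."""
    base = lam0 + x
    return sum((base + k * b) / (base + j * b) for j in range(k))


def u_1k(u: list, lam0: float, b: float) -> float:
    """u_{1,k} = prod_{j=1..k} u_j^{lam0/(lam0+(j-1)b)}, where u = [u_1, ..., u_k]."""
    return math.prod(uj ** (lam0 / (lam0 + j * b)) for j, uj in enumerate(u))


def u_2k(u: list, lam0: float, b: float, x: float) -> float:
    """u_{2,k}(x) = prod_{j=1..k-1} u_{j+1}^{(lam0+x)/(lam0+x+(j-1)b)}."""
    base = lam0 + x
    return math.prod(uj ** (base / (base + j * b)) for j, uj in enumerate(u[1:]))
```

For example, with `lam0 = 0.1`, `b = 0.01` and `u = [1e-3, 2e-3, 5e-4]`, the value `u_1k(u, lam0, b)` should coincide with `u[0] * u_2k(u, lam0, b, b) ** (lam0 / (lam0 + b))`.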

Proof of Theorem 5. We shall prove the result under the more general assumption that $Z_0(t) \sim V_0 e^{\lambda_0 t}$ for some constant $V_0$. The result then holds for $k = 1$ by Lemma 2. We shall prove the general result by induction on $k$. To this end, suppose the result holds for $k - 1$. Let $Z_k^{s,x,v}(t)$ be the type-$k$ descendants at time $t$ of the type-1 mutant at $(s, x, v) \in \mathcal{N}(t)$. Since $Z_1^{s,x}(t) \sim v e^{(\lambda_0+x)(t-s)}$ compared to $Z_0(t) \sim V_0 e^{\lambda_0 t}$, it follows from the induction hypothesis that

\[
E\exp\left(-\theta (t-s)^{k-1+p_{k-1}(x)} e^{-(\lambda_0+x+(k-1)b)(t-s)} Z_k^{s,x,v}(t)\right)
\to \exp\left(-c_{k-1}(\lambda_0+x, b)\, v\, u_{2,k}(x)\, \theta^{(\lambda_0+x)/(\lambda_0+x+(k-1)b)}\right). \tag{4.3}
\]
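Integrating over the contributions from the three-dimensional point process we have

\[
E\exp(-\theta Z_k(t)) = \exp\left(-\int_0^b dx\, g(x) \int_0^t ds\, u_1 V_0 e^{\lambda_0 s} \int_0^\infty dv \left(\frac{\lambda_0+x}{a_0+x}\right)^2 \exp\left(-\frac{\lambda_0+x}{a_0+x}\, v\right) \bigl(1 - \phi^{x,v}_{k-1,t-s}(\theta)\bigr)\right)
\]

where $\phi^{x,v}_{k-1,t-s}(\theta) = E\exp(-\theta Z_k^{s,x,v}(t))$. To prove the desired result we need to replace $\theta$ by $\theta t^{k+p_k} e^{-(\lambda_0+kb)t}$. Doing this with (4.3) in mind we have

\[
\begin{aligned}
&E\exp\left(-\theta t^{k+p_k} e^{-(\lambda_0+kb)t} Z_k(t)\right) \\
&\quad = \exp\Biggl(-\int_0^b dx\, g(x) \int_0^t ds\, u_1 V_0 e^{\lambda_0 s} \int_0^\infty dv \left(\frac{\lambda_0+x}{a_0+x}\right)^2 \exp\left(-\frac{\lambda_0+x}{a_0+x}\, v\right) \\
&\qquad\quad \times \Bigl(1 - \phi^{x,v}_{k-1,t-s}\bigl(\theta t^{k+p_k} e^{-(\lambda_0+x+(k-1)b)(t-s)} e^{-(b-x)t} e^{-(\lambda_0+x+(k-1)b)s}\bigr)\Bigr)\Biggr).
\end{aligned}
\]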

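By Lemmas 3 and 4 we can restrict attention to $x \in [b_t, b]$ and $s \le \alpha_k \log t$. The first restriction implies that all of the $x$'s except the one in $(b - x)$ can be set equal to $b$ and the second that we can replace $t$ by $t - s$. Since $(k + p_k) - (k - 1 + p_{k-1}(b)) = (\lambda_0 + kb)/\lambda_0$, the term in the exponential is

\[
\begin{aligned}
\sim -\int_{b_t}^b dx\, g(x) \int_0^{\alpha_k \log t} ds\, u_1 V_0 e^{\lambda_0 s} &\int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right) \\
&\times \Bigl(1 - \phi^{x,v}_{k-1,t-s}\bigl(\theta (t-s)^{k-1+p_{k-1}(b)} e^{-(\lambda_0+kb)(t-s)}\, t^{(\lambda_0+kb)/\lambda_0} e^{-(b-x)t} e^{-(\lambda_0+kb)s}\bigr)\Bigr).
\end{aligned}
\]

Changing variables $s = R(t) + r$ where $R(t) = (1/\lambda_0)\log t$, and $y = (b - x)t$, $dy = -t\,dx$, the above becomes

\[
\begin{aligned}
= -g(b) \int_0^{(2k+1)\log t} dy &\int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right) \\
&\times \int_{-R(t)}^{\alpha_k \log t - R(t)} dr\, u_1 V_0 e^{\lambda_0 r} \Bigl(1 - \phi^{x,v}_{k-1,t-s}\bigl(\theta (t-s)^{k-1+p_{k-1}(b)} e^{-(\lambda_0+kb)(t-s)} e^{-y} e^{-(\lambda_0+kb)r}\bigr)\Bigr).
\end{aligned}
\]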

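Using (4.3) now we have that the $1 - \phi$ term converges to

\[
1 - \exp\left(-c_{k-1}(\lambda_0+b, b)\, v\, u_{2,k}\, [\theta e^{-y}]^{(\lambda_0+b)/(\lambda_0+kb)} e^{-(\lambda_0+b)r}\right).
\]

To simplify the exponential we let

\[
r = \frac{1}{\lambda_0+b}\bigl(q + Q(v,y)\bigr) \quad\text{where}\quad Q(v,y) = \log\left(c_{k-1}(\lambda_0+b, b)\, v\, u_{2,k}\, [\theta e^{-y}]^{(\lambda_0+b)/(\lambda_0+kb)}\right),
\]

$dr = dq/(\lambda_0 + b)$. Plugging this into $e^{\lambda_0 r}$ results in

\[
e^{q\lambda_0/(\lambda_0+b)} \bigl(c_{k-1}(\lambda_0+b, b)\, v\, u_{2,k}\bigr)^{\lambda_0/(\lambda_0+b)} \theta^{\lambda_0/(\lambda_0+kb)} e^{-y\lambda_0/(\lambda_0+kb)}
\]

so the exponential converges to

\[
\begin{aligned}
-g(b)\, &c_{k-1}(\lambda_0+b, b)^{\lambda_0/(\lambda_0+b)}\, V_0 u_1 u_{2,k}^{\lambda_0/(\lambda_0+b)}\, \theta^{\lambda_0/(\lambda_0+kb)} \int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 v^{\lambda_0/(\lambda_0+b)} \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right) \\
&\times \int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+kb)} \int_{-\infty}^{\infty} \frac{dq}{\lambda_0+b}\, e^{q\lambda_0/(\lambda_0+b)} \bigl(1 - \exp(-e^{-q})\bigr).
\end{aligned}
\]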

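To clean this up, we note that letting $w = v(\lambda_0 + b)/(a_0 + b)$, $dw = dv\,(\lambda_0 + b)/(a_0 + b)$,

\[
\int_0^\infty dv \left(\frac{\lambda_0+b}{a_0+b}\right)^2 v^{\lambda_0/(\lambda_0+b)} \exp\left(-\frac{\lambda_0+b}{a_0+b}\, v\right)
= \left(\frac{a_0+b}{\lambda_0+b}\right)^{-1+\lambda_0/(\lambda_0+b)} \Gamma\bigl(1 + \lambda_0/(\lambda_0+b)\bigr). \tag{4.4}
\]

The second integral is easy:

\[
\int_0^\infty dy\, e^{-y\lambda_0/(\lambda_0+kb)} = \frac{\lambda_0+kb}{\lambda_0}. \tag{4.5}
\]

The third one looks weird but when you put $x = e^{-q}$, $dx = -e^{-q}\,dq$, or $dq = -dx/x$, it is

\[
\frac{1}{\lambda_0+b} \int_0^\infty x^{-1-\lambda_0/(\lambda_0+b)} (1 - e^{-x})\, dx;
\]

then integrating by parts with $f(x) = 1 - e^{-x}$, $g'(x) = x^{-1-\lambda_0/(\lambda_0+b)}$, $f'(x) = e^{-x}$, $g(x) = -x^{-\lambda_0/(\lambda_0+b)}(\lambda_0+b)/\lambda_0$ turns it into

\[
\frac{1}{\lambda_0+b} \cdot \frac{\lambda_0+b}{\lambda_0}\, \Gamma\bigl(1 - \lambda_0/(\lambda_0+b)\bigr) = \frac{1}{\lambda_0}\, \Gamma\bigl(1 - \lambda_0/(\lambda_0+b)\bigr). \tag{4.6}
\]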
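The value of the third integral can be confirmed numerically. Writing $\alpha = \lambda_0/(\lambda_0 + b)$, the integration by parts above says that $\int_{-\infty}^{\infty} e^{\alpha q}(1 - \exp(-e^{-q}))\, dq = \Gamma(1 - \alpha)/\alpha$ for $0 < \alpha < 1$. The sketch below checks this with a trapezoid rule in the variable $q$, where the integrand is smooth and decays exponentially in both directions. The function names, step size and cutoffs are illustrative choices.

```python
import math


def third_integral(alpha: float, step: float = 0.01, cutoff: float = 40.0) -> float:
    """Trapezoid rule for int_{-inf}^{inf} e^{alpha*q} * (1 - exp(-e^{-q})) dq, 0 < alpha < 1."""

    def integrand(q: float) -> float:
        if q > 30.0:
            # 1 - exp(-e^{-q}) equals e^{-q} up to a relative error of about 1e-13
            return math.exp(-(1.0 - alpha) * q)
        if q < -30.0:
            # exp(-e^{-q}) underflows to 0 here
            return math.exp(alpha * q)
        return math.exp(alpha * q) * -math.expm1(-math.exp(-q))

    q_lo = -cutoff / alpha            # integrand is about e^{-cutoff} at both ends
    q_hi = cutoff / (1.0 - alpha)
    n = int((q_hi - q_lo) / step)
    h = (q_hi - q_lo) / n
    total = 0.5 * (integrand(q_lo) + integrand(q_hi))
    total += sum(integrand(q_lo + i * h) for i in range(1, n))
    return h * total


def third_integral_closed_form(alpha: float) -> float:
    """Gamma(1 - alpha) / alpha, the value obtained by integration by parts."""
    return math.gamma(1.0 - alpha) / alpha
```

Dividing either value by $\lambda_0 + b$ gives the right side of (4.6).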
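Putting this all together and using (4.2), we see that the exponent converges to the negative of

\[
c_{k-1}(\lambda_0+b, b)^{\lambda_0/(\lambda_0+b)} \cdot g(b)\, \frac{\lambda_0+kb}{\lambda_0} \cdot V_0 u_{1,k}\, \theta^{\lambda_0/(\lambda_0+kb)}
\cdot \frac{1}{\lambda_0} \left(\frac{a_0+b}{\lambda_0+b}\right)^{-1+\lambda_0/(\lambda_0+b)} \Gamma\bigl(1 + \lambda_0/(\lambda_0+b)\bigr)\, \Gamma\bigl(1 - \lambda_0/(\lambda_0+b)\bigr).
\]

Setting $c_k(\lambda_0, b)$ equal to the quantity in the last display divided by $V_0 u_{1,k} \theta^{\lambda_0/(\lambda_0+kb)}$ we have proved the result. □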

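To work out an explicit formula for the constant and to compare with Durrett and Moseley (2009), it is useful to let $\lambda_j = \lambda_0 + jb$, $a_j = a_0 + jb$ and

\[
c_{h,j} = \frac{1}{\lambda_{j-1}} \left(\frac{a_j}{\lambda_j}\right)^{-1+\lambda_{j-1}/\lambda_j} \Gamma(1 + \lambda_{j-1}/\lambda_j)\, \Gamma(1 - \lambda_{j-1}/\lambda_j).
\]

From this we see that

\[
\begin{aligned}
c_k(\lambda_0, b) &= c_{k-1}(\lambda_1, b)^{\lambda_0/\lambda_1}\, g(b)\, \frac{\lambda_k}{\lambda_0}\, c_{h,1} \\
&= c_{k-2}(\lambda_2, b)^{\lambda_0/\lambda_2} \cdot \left(g(b)\, \frac{\lambda_k}{\lambda_1}\, c_{h,2}\right)^{\lambda_0/\lambda_1} \cdot g(b)\, \frac{\lambda_k}{\lambda_0}\, c_{h,1}
\end{aligned}
\]

and hence

\[
c_k(\lambda_0, b) = \prod_{j=1}^{k} \left(g(b)\, \frac{\lambda_k}{\lambda_{j-1}}\, c_{h,j}\right)^{\lambda_0/\lambda_{j-1}}.
\]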
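The recursion and the product formula for $c_k(\lambda_0, b)$ can be compared numerically. The sketch below implements both, under the assumption that the recursion is started from $c_0 \equiv 1$ (equivalently $c_1(\lambda_0, b) = g(b)(\lambda_1/\lambda_0)c_{h,1}$) and that shifting $\lambda_0$ to $\lambda_0 + b$ also shifts $a_0$ to $a_0 + b$, as the definition $a_j = a_0 + jb$ suggests. Function names are illustrative.

```python
import math


def c_h(j: int, lam0: float, a0: float, b: float) -> float:
    """c_{h,j} with lam_j = lam0 + j*b and a_j = a0 + j*b."""
    lam_prev = lam0 + (j - 1) * b
    lam_j = lam0 + j * b
    a_j = a0 + j * b
    r = lam_prev / lam_j
    return (1.0 / lam_prev) * (a_j / lam_j) ** (r - 1.0) * math.gamma(1.0 + r) * math.gamma(1.0 - r)


def c_k_recursive(k: int, lam0: float, a0: float, b: float, g_b: float) -> float:
    """c_k(lam0, b) = c_{k-1}(lam0+b, b)^{lam0/(lam0+b)} * g(b) * (lam_k/lam0) * c_{h,1}."""
    if k == 0:
        return 1.0  # assumed starting value for the recursion
    inner = c_k_recursive(k - 1, lam0 + b, a0 + b, b, g_b)
    return inner ** (lam0 / (lam0 + b)) * g_b * (lam0 + k * b) / lam0 * c_h(1, lam0, a0, b)


def c_k_product(k: int, lam0: float, a0: float, b: float, g_b: float) -> float:
    """Product form: prod_{j=1..k} (g(b) * lam_k/lam_{j-1} * c_{h,j})^{lam0/lam_{j-1}}."""
    lam_k = lam0 + k * b
    out = 1.0
    for j in range(1, k + 1):
        lam_prev = lam0 + (j - 1) * b
        out *= (g_b * lam_k / lam_prev * c_h(j, lam0, a0, b)) ** (lam0 / lam_prev)
    return out
```

With the figure-caption values ($\lambda_0 = 0.1$, $a_0 = 0.2$, $b = 0.01$, $g(b) = 100$) the two functions should agree for every $k$ up to floating-point rounding.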



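In Durrett and Moseley (2009) if we let $\mathcal{F}_{k-1}$ be the $\sigma$-field generated by $Z_j(t)$ for $j < k$ and all $t \ge 0$ then

\[
E(e^{-\theta V_k} \mid \mathcal{F}_{k-1}) = \exp\bigl(-u_k V_{k-1} c_{h,k}\, \theta^{\lambda_{k-1}/\lambda_k}\bigr).
\]

Iterating we have

\[
\begin{aligned}
E(e^{-\theta V_k} \mid \mathcal{F}_{k-2}) &= E\bigl(\exp(-u_k V_{k-1} c_{h,k}\, \theta^{\lambda_{k-1}/\lambda_k}) \mid \mathcal{F}_{k-2}\bigr) \\
&= \exp\bigl(-u_{k-1} V_{k-2} c_{h,k-1} (u_k c_{h,k})^{\lambda_{k-2}/\lambda_{k-1}}\, \theta^{\lambda_{k-2}/\lambda_k}\bigr)
\end{aligned}
\]

and hence

\[
E(e^{-\theta V_k} \mid \mathcal{F}_0) = \exp\bigl(-c_{\theta,k} V_0 u_{1,k}\, \theta^{\lambda_0/\lambda_k}\bigr)
\]

where $c_{\theta,k} = \prod_{j=1}^{k} c_{h,j}^{\lambda_0/\lambda_{j-1}}$.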
5 Proofs for unbounded distributions

In this Section, we prove Theorem 7. The first step is to show that unlike in the case of bounded mutational advances, for unbounded distributions, the main contribution to the limit is given by the descendants of a single mutation. The largest growth rate will come from $z = O(t^{(\alpha+1)/\alpha})$ so the next result is enough. Recall that the mean number of mutations with growth rate larger than $z$ is

\[
\begin{aligned}
\mu(z, \infty) &= V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} e^{\lambda_0 s} \exp\left(-\gamma\left(\frac{z}{t-s} - \lambda_0\right)^{\alpha}\right) ds \\
&= V_0 u_1 \int_0^t \left(\frac{z}{t-s} - \lambda_0\right)^{\beta} \exp(\phi(s, z))\, ds
\end{aligned}
\]

where $\phi$ is as in (1.3).
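Lemma 5. Let $z > \lambda_0 t$. Then

\[
E\Bigl(\sum_{(s,x):\, z(s,x) \le z} Z_1^{s,x}(t)\Bigr) \le C V_0 u_1\, z\, e^{\lambda_0 t + z}
\]

as $t \to \infty$.

Proof. The expected number of individuals produced by mutations with growth rates $\le z$ is

\[
V_0 u_1 \int_0^t \int_0^{z/(t-s) - \lambda_0} e^{\lambda_0 s} \cdot y^{\beta} e^{-\gamma y^{\alpha}} \cdot e^{z(s,y)}\, dy\, ds.
\]

Changing variables $y \mapsto u = z(s, y)$, that is $y = u/(t-s) - \lambda_0$, $dy = du/(t-s)$, and using Fubini's theorem to switch the order of integration, we can see that the above is

\[
\le V_0 u_1 e^{\lambda_0 t + z} \int_0^z \int_{(t - u/\lambda_0)^+}^{t} \left(\frac{u}{t-s} - \lambda_0\right)^{\beta} \exp\left(-\gamma\left(\frac{u}{t-s} - \lambda_0\right)^{\alpha}\right) \frac{ds}{t-s}\, du. \tag{5.1}
\]

But then if we change variables $s \mapsto r = u/(t-s) - \lambda_0$, $dr = u\,ds/(t-s)^2$, we can see that the inner integral is at most

\[
\int_0^\infty \frac{r^{\beta}}{r + \lambda_0}\, e^{-\gamma r^{\alpha}}\, dr \le C
\]

yielding the desired bound. □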

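To motivate the proof of the general result, we begin with the case when $\alpha = 1$.

Proof of Theorem 6. Since

\[
Z_1(t) = \sum_{(s,x) \in \mathcal{N}(t)} Z_1^{s,x}(t) = \sum_{(s,x):\, z(s,x) \le z} Z_1^{s,x}(t) + \sum_{(s,x):\, z(s,x) > z} Z_1^{s,x}(t)
\]

for any $z > 0$, we have

\[
\frac{1}{t} \log Z_1(t) \sim \frac{1}{t} \log\Bigl(\sum_{(s,x):\, z(s,x) \le z} Z_1^{s,x}(t)\Bigr) \vee \frac{1}{t} \log\Bigl(\sum_{(s,x):\, z(s,x) > z} Z_1^{s,x}(t)\Bigr)
\]

as $t \to \infty$. Lemma 5 tells us that if there is a mutation with growth rate $z = O(t^2)$, then the contribution from mutations with growth rates smaller than $z - \epsilon$ can be ignored so it suffices to describe the distribution of the largest growth rates. We will show that

\[
\mu(z, \infty) \to
\begin{cases}
(2c_0)^{\beta} \left(\dfrac{\pi}{2\lambda_0}\right)^{1/2} V_0 u_1 \exp\left(\gamma\lambda_0 - \dfrac{\lambda_0 x}{2c_0}\right) & \text{if } z = c_0 t^2 \left(1 + \dfrac{(2\beta+1)\log t}{\lambda_0 t} + \dfrac{x}{c_0 t}\right), \\[2ex]
0 & \text{if } z > c_0 t^2 \left(1 + \dfrac{\kappa \log t}{t}\right) \text{ with } \kappa > \dfrac{2\beta+1}{\lambda_0},
\end{cases} \tag{5.2}
\]

so that the largest growth rate is $O(t^2)$ and comes from the rightmost particle in the point process with intensity given by (1.6).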

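To prove (5.2), we first need to locate the maximum of $\phi$. Let $z > \lambda_0 t$ so that there exists a unique maximum $s_z$. Solving $\phi_s(s, z) = 0$ and using the expression for $\phi_s$ in (1.4) yields

\[
s_z = t - a_0 z^{1/2}
\]

where $a_0 = (\gamma/\lambda_0)^{1/2} = (4c_0)^{-1/2}$, which leads to the expression

\[
\begin{aligned}
\phi(s_z, z) &= \lambda_0 t - \lambda_0 (t - s_z) - \gamma\left(\frac{z}{t - s_z} - \lambda_0\right) \\
&= \lambda_0 t - \lambda_0 a_0 z^{1/2} - \gamma z^{1/2}/a_0 + \gamma\lambda_0 \\
&= \lambda_0 (t - 2a_0 z^{1/2}) + \gamma\lambda_0.
\end{aligned} \tag{5.3}
\]

If we take

\[
z_x = c_0 t^2 \left(1 + \frac{\kappa \log t}{t} + \frac{x}{c_0 t}\right), \qquad\text{so that}\qquad z_x^{1/2} = \frac{t}{2a_0} \left(1 + \frac{\kappa \log t}{t} + \frac{4a_0^2 x}{t}\right)^{1/2},
\]

in (5.3) and use $(1 + y)^{1/2} = 1 + y/2 + O(y^2)$, we obtain

\[
\phi(s_{z_x}, z_x) = -\frac{\lambda_0 \kappa}{2} \log t - 2\lambda_0 a_0^2 x + \gamma\lambda_0 + o(1) \tag{5.4}
\]

as $t \to \infty$. Furthermore, (1.5) implies that

\[
\phi_{ss}(s_{z_x}, z_x) = -\frac{2\gamma z_x}{(t - s_{z_x})^3} = -\frac{2\gamma}{a_0^3 z_x^{1/2}} \sim -\frac{a}{t}
\]

as $t \to \infty$ with $a = 4\gamma/a_0^2$. Since $\phi_s(s_z, z) = 0$, taking a Taylor expansion around $s_z$ yields

\[
\phi(s, z_x) = \phi(s_{z_x}, z_x) - \frac{a}{2t} (s - s_{z_x})^2 + g(s, z_x) \tag{5.5}
\]

where $|g(s, z)| \le C|s - s_z|^3/t^2$ for all $s$. Also note that letting

\[
\psi(s, z) = \left(\frac{z}{t - s} - \lambda_0\right)^{\beta}
\]

we have

\[
\psi(s_{z_x}, z_x) = \left(\frac{z_x^{1/2}}{a_0} - \lambda_0\right)^{\beta} = \frac{z_x^{\beta/2}}{a_0^{\beta}} + o(z_x^{\beta/2}) = (2c_0)^{\beta} t^{\beta} + o(t^{\beta})
\]

so that

\[
\psi(s, z_x) = (2c_0)^{\beta} t^{\beta} + g_2(s, z_x)
\]

where $|g_2(s, z)|\, |s - s_z|^{-1} t^{-\beta} = o(1)$.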
Write

\[
\int_0^t \psi(s, z_x) e^{\phi(s, z_x)}\, ds = \int_A \psi(s, z_x) e^{\phi(s, z_x)}\, ds + \int_{A^c} \psi(s, z_x) e^{\phi(s, z_x)}\, ds
\]

where $A = \{s : |s - s_{z_x}| \le C(t \log t)^{1/2}\} \cap [0, t]$. Since concavity implies that for $s \in A^c$ and $C$ sufficiently large, we have

exp(4>(s, z x )) < -^exp(4>(s Zx ,z x )) 

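the contribution of the second integral is negligible. After the change of variables $s = s_{z_x} + (t/a)^{1/2} r$, when $t$ is large, the first integral becomes

\[
\int_A \psi(s, z_x) e^{\phi(s, z_x)}\, ds = \bigl((2c_0)^{\beta} + o(1)\bigr)\, t^{\beta}\, e^{\phi(s_{z_x}, z_x)} \int_{-C(\log t)^{1/2}}^{C(\log t)^{1/2}} e^{g(s, z_x)} e^{-r^2/2} (t/a)^{1/2}\, dr
\]

and therefore since $|g(s, z_x)| \le C(t \log t)^{3/2}/t^2$ when $s \in A$, we have

\[
\mu(z_x, \infty) = V_0 u_1 \int_0^t \psi(s, z_x) e^{\phi(s, z_x)}\, ds \sim \delta V_0 u_1\, t^{\beta + 1/2}\, e^{\phi(s_{z_x}, z_x)} \tag{5.6}
\]

where $\delta = (2c_0)^{\beta} \sqrt{2\pi/a} = (2c_0)^{\beta} (\pi/(2\lambda_0))^{1/2}$. Since

\[
\phi(s_{z_x}, z_x) = -\frac{\kappa\lambda_0}{2} \log t - 2\lambda_0 a_0^2 x + \gamma\lambda_0 + o(1),
\]

we can conclude that

\[
\mu(z_x, \infty) \to
\begin{cases}
V_0 u_1 \delta \exp(\gamma\lambda_0 - 2\lambda_0 a_0^2 x) = V_0 u_1 \delta \exp\bigl(\gamma\lambda_0 - \lambda_0 x/(2c_0)\bigr) & \text{if } \kappa = \dfrac{2\beta+1}{\lambda_0}, \\[2ex]
0 & \text{if } \kappa > \dfrac{2\beta+1}{\lambda_0},
\end{cases}
\]

which proves (5.2) since this argument remains true even if $\kappa = \kappa(t)$ and $\liminf \kappa(t) > (2\beta+1)/\lambda_0$. □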
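The closed-form maximizer used in the $\alpha = 1$ argument can be checked against a brute-force search. The sketch below evaluates $\phi(s, z) = \lambda_0 s - \gamma(z/(t-s) - \lambda_0)$ on a fine grid in $s$ and compares the grid maximum with $s_z = t - a_0 z^{1/2}$ and with the value in (5.3). The function names and the numerical values in the usage note ($\lambda_0 = 0.1$, $\gamma = 1$, $t = 100$) are illustrative assumptions, not values from the paper.

```python
import math


def phi(s: float, z: float, t: float, lam0: float, gamma: float) -> float:
    """phi(s, z) for alpha = 1: lam0*s - gamma*(z/(t - s) - lam0)."""
    return lam0 * s - gamma * (z / (t - s) - lam0)


def maximizer_closed_form(z: float, t: float, lam0: float, gamma: float):
    """Return (s_z, phi(s_z, z)) from s_z = t - a0*sqrt(z) and (5.3)."""
    a0 = math.sqrt(gamma / lam0)
    s_z = t - a0 * math.sqrt(z)
    phi_max = lam0 * (t - 2.0 * a0 * math.sqrt(z)) + gamma * lam0
    return s_z, phi_max


def maximizer_grid(z: float, t: float, lam0: float, gamma: float, n: int = 200_000):
    """Brute-force maximum of phi over the grid s = t*i/n, i = 0, ..., n-1."""
    best_s, best_val = 0.0, -math.inf
    for i in range(n):
        s = t * i / n
        val = phi(s, z, t, lam0, gamma)
        if val > best_val:
            best_s, best_val = s, val
    return best_s, best_val
```

For instance, with $z = c_0 t^2 = \lambda_0 t^2/(4\gamma) = 250$ the closed form gives $s_z = t/2$, in line with the observation that the dominant mutant is born at time about $t/(\alpha + 1)$.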
When $\alpha \ne 1$, we no longer have an explicit formula for the maximum value $s_z$, which complicates the process of identifying the largest growth rate. We shall assume for convenience that $\alpha > 0$ is an integer.

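Proof of Theorem 7. As in the proof of Theorem 6, it suffices to describe the distribution for the largest growth rates. Let $z > \lambda_0 t$ so the maximum $s_z$ exists. To find a useful expression for the value of $\phi(s_z, z)$, we write

\[
\phi(s, z) = \lambda_0 t - \lambda_0 (t - s) - \gamma\left(\frac{z}{t - s} - \lambda_0\right)^{\alpha}.
\]

Using the definition of $s_z$ as the solution to $\phi_s(s_z, z) = 0$ yields the condition that

\[
\lambda_0 = \frac{\alpha\gamma z}{(t - s_z)^2} \left(\frac{z}{t - s_z} - \lambda_0\right)^{\alpha - 1},
\]

i.e.,

\[
t - s_z = \left(\frac{\alpha\gamma}{\lambda_0}\right)^{1/(\alpha+1)} z^{\alpha/(\alpha+1)} \left(1 - \frac{\lambda_0 (t - s_z)}{z}\right)^{(\alpha-1)/(\alpha+1)}.
\]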



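If we substitute the right side of this equation back in for $t - s_z$ in the parenthesis, then writing $a_0 = (\alpha\gamma/\lambda_0)^{1/(\alpha+1)}$, we have

\[
\begin{aligned}
t - s_z &= a_0 z^{\alpha/(\alpha+1)} \left(1 - \lambda_0 a_0 z^{-1/(\alpha+1)} \left(1 - \frac{\lambda_0 (t - s_z)}{z}\right)^{\frac{\alpha-1}{\alpha+1}}\right)^{\frac{\alpha-1}{\alpha+1}} \\
&= a_0 z^{\alpha/(\alpha+1)} \left(1 - \lambda_0 a_0 z^{-1/(\alpha+1)} \left(1 - \lambda_0 a_0 z^{-1/(\alpha+1)} \left(1 - \frac{\lambda_0 (t - s_z)}{z}\right)^{\frac{\alpha-1}{\alpha+1}}\right)^{\frac{\alpha-1}{\alpha+1}}\right)^{\frac{\alpha-1}{\alpha+1}}.
\end{aligned}
\]

We repeat this $\alpha$ times and then use the approximation $(1 - x)^n = 1 - nx + O(x^2)$ repeatedly with $n = (\alpha - 1)/(\alpha + 1)$ to obtain

\[
t - s_z = z^{\alpha/(\alpha+1)} \left(\sum_{j=0}^{\alpha} a_j z^{-j/(\alpha+1)} + O(z^{-1})\right) \tag{5.7}
\]

where

\[
a_j = a_0 \left(-\frac{\lambda_0 a_0 (\alpha - 1)}{\alpha + 1}\right)^{j}
\]

for $j \ge 1$. The error term is $O(z^{-1})$ because

\[
0 < 1 - \lambda_0 (t - s)/z \le 1
\]

for all $z > \lambda_0 t$ and $s \le t$. Factoring out $a_0$ in (5.7) and using $(1 + x)^{-1} = \sum_{i=0}^{\infty} (-x)^i$ when $|x| < 1$, we have that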



a 



t - S 



a = q v /a+1 i - j2 a o la ii z ~ ll/(a+1] + 



n~ 2 n- n- r -(n+»2)/(<*+ 1 ) 



ii=l ii,i2=l 



S° JJ ai .z-^=i^ /(Q+1) +0(z- 1 ) ) - A 2 

tl,...,ia=l j = l 



l/(a+l)^-l/(a+l) 



2 l/(o+l) 



^ & ^-i/(«+l) + ( 2 -l) j ( 5i 
0=0 



for large 2 where the bj are given by 



b = l/a 

h = -ax/al - A 

h = -(«2 - «?)/ a o 

6 3 = — (04 — 2aia3 — a| — 3a^a2 + Oi)/ao 



and in general, 



f)5.8p implies that 



k 



f.=E e (^)-" +,, n- 

fc=l ii,...,ijt:iiH hik=i 3=1 



-7 ( - A ) = - 7 z a /( Q + 1 ) (6« + o^-^z-VCa+i) 



a 



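and therefore,

\[
\begin{aligned}
\phi(s_z, z) &= \lambda_0 t - \lambda_0 (t - s_z) - \gamma\left(\frac{z}{t - s_z} - \lambda_0\right)^{\alpha} \\
&= \lambda_0 t + \sum_{j=0}^{\alpha} d_j z^{(\alpha - j)/(\alpha+1)} + O(z^{-1/(\alpha+1)})
\end{aligned} \tag{5.9}
\]

where the $d_j$ can be calculated explicitly, for example:

\[
\begin{aligned}
d_0 &= -\lambda_0 a_0 - \gamma b_0^{\alpha} \\
d_1 &= -\lambda_0 a_1 - \gamma \alpha b_0^{\alpha-1} b_1 \\
d_2 &= -\lambda_0 a_2 - \gamma\left(\alpha b_0^{\alpha-1} b_2 + \binom{\alpha}{2} b_0^{\alpha-2} b_1^2\right) \\
d_3 &= -\lambda_0 a_3 - \gamma\left(\alpha b_0^{\alpha-1} b_3 + 2\binom{\alpha}{2} b_0^{\alpha-2} b_1 b_2 + \binom{\alpha}{3} b_0^{\alpha-3} b_1^3\right).
\end{aligned}
\]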

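To figure out the distribution of the growth rate for the largest mutant, we let $c_0 = (-\lambda_0/d_0)^{(\alpha+1)/\alpha}$ and then search for $\kappa_j$, $j = 1, \ldots, \alpha - 1$ and $\kappa$ so that plugging

\[
z_x = c_0 t^{(\alpha+1)/\alpha} \left(1 + \sum_{j=1}^{\alpha-1} \kappa_j t^{-j/\alpha} + \frac{x}{c_0 t} + \frac{\kappa \log t}{t}\right)
\]

into (5.9) yields

\[
\phi(s_{z_x}, z_x) = k_1 - k_2 x - k_3 \log t \tag{5.10}
\]

for some constants $k_1$, $k_2$, $k_3$. Substituting $z_x$ into (5.9) and writing $\kappa_0 = 1$, $\kappa_\alpha = x/c_0$ to ease the notation we obtain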



j=o K ( " 



(a-j)/a ( a 



(a-j)/(a+l) 



^K j r j/a + Kt~ l \ogt 

vj=0 



+ 0(f 



Since Aot + g?o( — Aoi/do) = 0, the first order terms in this expansion is t^ a and after using 
the Taylor series expansion 



(1 + x ) p = 1 + px + p(p - l)x 2 /2 H h p(p - 1) • • • (p - a + l)x a /a\ + 0(a; 



we obtain 



where 



[s zo ,z ) = Y,P^ {a ~ ])/a + plogt + 0(r 1 / Q logt) 



P = d 
Pi = d 



p2 = do[ --r 



Ao 



/V 



a 



d J \a + 1 



Ao 
rf 



a: 



-c 2 + 



Ci + C?i 

a: 



a + 1 

\ \ (a-l)/a 
rf 



a + 1 







-1 c 



(a-l)/a 



a — 1 



a 



ci + 4 ( --p 

cio 



(a-2)/a 



(5.11) 
and in general



3 / \ \ {a-i)/a j-i k , v 

i=0 v u/ fc=l £=1 v 7 

j = 1,1,..., a where for each % and k, in the inner product, ii,...,ik are always chosen to 
satisfy $i_1 + i_2 + \cdots + i_k = j - i$. Since $p_j$ depends only on $\kappa_i$, $i \le j$, then after noting that the coefficient of $\kappa_j$ in $p_j$ is $-\alpha\lambda_0/(\alpha + 1)$, we can use forward substitution to solve the system $p_j = 0$, $j = 1, 2, \ldots, \alpha - 1$ for $\kappa_j$ to obtain the recursive formulas

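\[
\kappa_j = \frac{\alpha + 1}{\alpha\lambda_0} \left(p_j - \frac{-\alpha\lambda_0}{\alpha + 1}\, \kappa_j\right) \tag{5.12}
\]

for $j = 1, 2, \ldots, \alpha - 1$. Setting $\rho = -k_3$ yields

\[
\kappa = \frac{(\alpha + 1) k_3}{\alpha\lambda_0}
\]

and for this choice of $\kappa_j$, $\kappa$, we obtain (5.10) with

\[
k_2 = -\frac{\alpha}{\alpha + 1}\, \frac{d_0}{c_0} \left(-\frac{\lambda_0}{d_0}\right) = \frac{\alpha\lambda_0}{(\alpha + 1) c_0}
\]

and $k_1 = p_\alpha + k_2 x$. Since

\[
\left(\frac{z_x}{t - s_{z_x}} - \lambda_0\right)^{\beta} = \frac{z_x^{\beta/(\alpha+1)}}{a_0^{\beta}} + o\bigl(z_x^{\beta/(\alpha+1)}\bigr) = \left(\frac{c_0^{1/(\alpha+1)}}{a_0}\right)^{\beta} t^{\beta/\alpha} + o(t^{\beta/\alpha}),
\]

choosing $k_3 = (2\beta/\alpha + 1)/2$ replaces (5.4) in the proof of Theorem 6.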
Now substituting (5.7) and (5.8) in (1.5) yields



a 



a-2 



[s g ,z) = -a{a - 1) 7 zSt [ ^^-^+1) + O^z' 1 ) 



X 



z 2 



4a/(a+l) ^ =0 a i 2-i/(«+ 1 ) + O^ 1 )) 



a + 1 



0=0 

2z 



z 3a/(a+l) ^; =0 a i Z-i/(«+ 1 ) + CKZ- 1 )) 

= [-a(a - l)7&r 2 K - aib^/a^z-^^ + o(^- Q/(a+1) ) 

= _^ z -«/(«+l) +0 ( ;s -a/(a+ 1 ) ) 

where in the second to last line we have used the fact that $b_0 = 1/a_0$. When $z = z_x$, this becomes

\[
\phi_{ss}(s_{z_x}, z_x) = -\frac{a}{t} + o(t^{-1})
\]

where

a 2 7 



_a+2 ' 



Since $\phi_s(s_z, z) = 0$ and a calculation similar to the one above shows that $\phi_{sss}(s_{z_x}, z_x) = O(t^{-2})$, we have

\[
\phi(s, z_x) = \phi(s_{z_x}, z_x) - \frac{a}{2t} (s - s_{z_x})^2 + g(s, z_x)
\]

where $|g(s, z)| \le C|s - s_z|^3/t^2$ for all $s$. This replaces (5.5) from the $\alpha = 1$ proof and the rest of the proof is the same. Note that the intensity for the limiting point process is given by



a 



^/2ii/aexp(k 1 — k 2 x). (5.13) 



□ 

Remark 1. From (5.7), we have

\[
t - s_{z_x} \sim a_0 \left(c_0 t^{(\alpha+1)/\alpha}\right)^{\alpha/(\alpha+1)} = \frac{\alpha t}{\alpha + 1}
\]

which tells us that the time at which the mutant with largest growth rate is born is $\sim t/(\alpha + 1)$.
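Remark 1 can be illustrated numerically by solving the first-order condition for $s_z$ directly. The sketch below iterates the fixed-point form $t - s_z = a_0 z^{\alpha/(\alpha+1)}(1 - \lambda_0(t - s_z)/z)^{(\alpha-1)/(\alpha+1)}$ at $z = c_0 t^{(\alpha+1)/\alpha}$, using $d_0 = -\lambda_0 a_0 - \gamma b_0^{\alpha}$ with $b_0 = 1/a_0$, and returns the birth time as a fraction of $t$ together with the residual of $\phi_s(s_z, z) = 0$. The function name, the iteration count and the parameter values in the usage note are illustrative assumptions.

```python
def birth_time_fraction(alpha: int, lam0: float, gamma: float, t: float, n_iter: int = 200):
    """Return (s_z / t, residual of the first-order condition) at z = c0 * t^{(alpha+1)/alpha}."""
    a0 = (alpha * gamma / lam0) ** (1.0 / (alpha + 1))
    d0 = -lam0 * a0 - gamma * a0 ** (-alpha)          # d_0 with b_0 = 1/a_0
    c0 = (-lam0 / d0) ** ((alpha + 1) / alpha)
    z = c0 * t ** ((alpha + 1) / alpha)
    n = (alpha - 1) / (alpha + 1)
    lead = a0 * z ** (alpha / (alpha + 1))            # leading-order value of t - s_z
    u = lead
    for _ in range(n_iter):                           # fixed-point iteration for u = t - s_z
        u = lead * (1.0 - lam0 * u / z) ** n
    residual = lam0 - alpha * gamma * z * (z / u - lam0) ** (alpha - 1) / u ** 2
    return (t - u) / t, residual
```

With $\lambda_0 = 0.1$ and $\gamma = 1$, the returned fraction is $1/2$ for $\alpha = 1$ and close to $1/3$ for $\alpha = 2$ once $t$ is large, in line with the remark.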
6 Discussion



In this paper, we have analyzed a multi-type branching process model of tumor progression 
in which mutations increase the birth rates of cells by a random amount. We studied both 
bounded and unbounded distributions for the random fitness advances and calculated the 
asymptotic rate of expansion for the kth generation of mutants. 

In the bounded setting, we found that there are only two parameters of the distribution 
that affect the limiting growth rate of the kth generation (see Theorems dj HJ and [5]): the 
upper bound for the support of the distribution and the value of its density at the upper 
bound. This is a rather intuitive result since one would expect that in the long run, the kth 
generation will be dominated by mutants with the maximum possible fitness. In addition, we 
found that there is a polynomial correction to the exponential growth of the kth generation. 
This correction is not present in the case where the fitness advances are deterministic. We 
have discussed this point in further detail in Section 1.1 and after the proof of Theorem 5 in Section 4. Finally, we showed that the limiting population is descended from several different mutations (see Theorem 3).

In the unbounded setting, we assumed that the distribution of the fitness advance has the form

\[
P(X > x) = x^{\beta} e^{-\gamma x^{\alpha}}
\]

where $\alpha$, $\beta$, and $\gamma$ are parameters. We found that the population of cells with a single mutation grows asymptotically at a super-exponential rate $\exp(t^{(\alpha+1)/\alpha})$ (see Theorems 6 and 7) and at large times, most of the first generation is derived from a single mutation (see Lemma 5). The super-exponential growth rate suggests that the exponential distribution, which is often used for the fitness advances of an organism due to natural selection, is not a good choice for modeling the mutational advances in the progression to cancer where there is very little evidence for populations growing at a super-exponential rate.

These conclusions provide several interesting contributions to the existing literature on 
evolutionary models of cancer progression. First, our model generalizes previous multi-type 
branching models of tumor progression by allowing for random fitness advances as mutations 
are accumulated and provides a mathematical framework for further investigations into the 
role played by the fitness distribution of mutational advances in driving tumorigenesis. Sec- 
ond, we have discovered that bounded distributions lead to exponential growth whereas 
unbounded distributions lead to super-exponential growth. This dichotomy might provide a 
new method for testing whether a tumor population has evolved with an unbounded distri- 
bution of mutational advances. Third, we observe that in the case of bounded distributions, 
the growth rate of the tumor is somewhat 'robust' with respect to the mutational fitness 
distribution and depends only on its upper endpoint. Finally, our calculations of the growth 
rates for the kth generation of mutants serve as a groundwork for studying the evolution and 
role of heterogeneity in tumorigenesis. These implications will be explored further in future 
work. 
Figure 1: Plot of the exact Laplace transform (LT) for $t^{1+p} e^{-(\lambda_0+b)t} Z_1(t)$ at times $t = 60, 80, 100, 120$, the approximations from Monte Carlo (MC) simulations at the corresponding times, and the asymptotic Laplace transform from Theorem 2. Parameter values: $a_0 = 0.2$, $b_0 = 0.1$, $b = 0.01$, and $u_1 = 10^{-3}$. $g$ is uniform on $[0, 0.01]$.

Figure 2: Plot of the approximations to the Laplace transform of $t^{2+p_2} e^{-(\lambda_0+2b)t} Z_2(t)$ from Monte Carlo (MC) simulations at times $t = 80, 100, 120$ along with the asymptotic Laplace transform from Theorem 5. Parameter values: $a_0 = 0.2$, $b_0 = 0.1$, $b = 0.01$, and $u_1 = u_2 = 10^{-3}$. $g$ is uniform on $[0, 0.01]$.